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Abstract 

Compressed  sensing  is  an  important  field  with  continuing  advances  in  theory  and 
applications.  This  thesis  provides  contributions  to  both  theory  and  application.  Much 
of  the  theory  behind  compressed  sensing  is  based  on  uncertainty  principles,  which 
state  that  a  signal  cannot  be  concentrated  in  both  time  and  frequency.  We  develop  a 
new  discrete  uncertainty  principle  and  use  it  to  demonstrate  a  fundamental  limitation 
of  the  demixing  problem,  and  to  provide  a  fast  method  of  detecting  sparse  signals. 
The  second  half  of  this  thesis  focuses  on  a  specific  application  of  compressed  sensing: 
hyperspectral  imaging.  Conventional  hyperspectral  platforms  require  long  exposure 
times,  which  can  limit  their  utility,  and  so  we  propose  a  compressed  sensing  platform 
to  quickly  sample  hyperspectral  data.  We  leverage  certain  combinatorial  designs  to 
build  good  coded  apertures,  and  then  we  apply  block  orthogonal  matching  pursuit  to 
quickly  reconstruct  the  desired  imagery. 
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RECENT  ADVANCES  IN  COMPRESSED  SENSING: 

DISCRETE  UNCERTAINTY  PRINCIPLES  AND 
FAST  HYPERSPECTRAL  IMAGING 

I.  Introduction 

Good  decisions  are  typically  informed  by  relevant  data.  Unfortunately,  many 
settings  exhibit  a  bottleneck  at  the  data-collcction  step,  which  presents  an  issue  in 
the  face  of  urgency.  Over  the  last  decade,  this  limitation  has  spurred  research  in 
speeding  up  data  collection  with  a  developing  theory  called  compressed  sensing. 

The  original  motivation  for  compressed  sensing  research  came  from  applications  to 
medical  imaging,  e.g.,  magnetic  resonance  imaging  (MRI).  Since  the  early  1980s,  MRI 
has  granted  doctors  the  ability  to  distinguish  between  healthy  tissue  and  cancerous 
tumors  without  the  need  for  invasive  surgical  procedures.  Unfortunately,  MRI  ma¬ 
chines  take  a  significant  amount  of  scanning  time  to  produce  high-resolution  images, 
which  are  often  necessary  for  diagnostics.  The  longer  a  patient  is  required  to  remain 
still,  the  more  likely  she  is  to  move,  which  causes  irreparable  distortions  in  the  final 
image.  In  the  last  decade,  Candes,  Romberg  and  Tao  [1]  showed  that  high-resolution 
images,  similar  to  those  produced  using  MRI,  can  be  reconstructed  from  very  few 
measurements,  which  in  turn  reduces  the  time  required  to  collect  the  necessary  data. 
To  achieve  this  speedup,  Candes  et  al.  leveraged  the  fact  that  the  desired  image  is 
nearly  sparse  in  the  wavelet  domain  [2], 

More  generally,  compressed  sensing  seeks  to  solve  the  linear  system  <Pa;  =  y,  where 
<f>  G  CMxN  with  M  <C  N .  Of  course,  since  M  <C  N,  this  underdetermined  linear 
system  is  impossible  to  uniquely  solve  using  traditional  linear  algebra.  However,  if 
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Figure  1.  (a)  The  Shepp  Logan  phantom  test  image,  (b)  White  pixels  denote  sample 
locations  of  (a)  in  the  Fourier  domain,  (c)  The  test  image  projected  onto  the  span  of 
the  sampled  Fourier  modes,  (d)  Reconstruction  of  (a)  by  minimizing  total  variation 
subject  to  the  Fourier  samples.  Perhaps  surprisingly,  the  reconstruction  is  exact  [1]. 

we  make  additional  assumptions  about  x ,  they  might  be  combined  with  the  linear 
data  to  uniquely  determine  the  original  vector  x.  For  example,  one  might  assume  x 
is  K -sparse,  i.e.,  at  most  K  entries  of  x  are  nonzero.  In  many  settings,  this  is  a  valid 
signal  model;  for  example,  JPEG2000  exploits  the  fact  that  natural  images  are  nearly 
sparse  in  the  wavelet  domain  [3,  4], 

For  a  demonstration  of  compressed  sensing,  see  Figure  1.  Here,  an  image  is 
sampled  in  only  a  few  locations  in  the  Fourier  domain.  Traditionally,  one  might 
project  the  image  onto  the  span  of  these  Fourier  modes  to  get  Figure  1(c),  thereby 
producing  significant  artifacts  that  would  prevent  proper  diagnostics.  However,  the 
image  can  be  exactly  recovered  by  instead  minimizing  a  certain  convex  objective 
function  subject  to  the  Fourier  samples.  This  is  the  compressed  sensing  solution 
that  delivers  high-resolution  imagery  from  very  few  measurements.  To  see  why  this 
works,  first  note  that  the  sensing  matrix  must  be  special  in  order  to  provide  complete 
information.  Indeed,  if  x  is  sparse  but  the  rows  of  <F  happen  to  be  orthogonal  to  x , 
then  <Fa:  fails  to  distinguish  x  from  the  zero  vector.  This  problem  can  be  averted  if  <F 
satisfies  the  (2 K ,  5)-restricted  isometry  property  (RIP),  that  is,  if 

(i-*)IMIl<M<  (1  +  *)M! 


2 


for  every  2A'-sparse  x.  The  main  result  in  compressed  sensing  is  that  solving 

argmin  ||x||i  subject  to  <f>a;  =  y 

is  equivalent  to  finding  the  sparsest  x  such  that  =  y  provided  <f>  satisfies  the 
(2AT,  <5)-RIP  with  5  <  y/2  —  1  and  the  sparsest  such  x  is  K -sparse  [5].  Thus,  not 
only  do  RIP  matrices  provide  complete  information  about  any  A'-sparse  vector, 
they  also  allow  for  reconstruction  by  l\  minimization,  which  can  be  implemented 
using  linear  programming.  Furthermore,  if  <f>  is  a  random  matrix  (e.g.,  <f>  has  iid 
Gaussian  entries),  then  with  high  probability,  <f>  satisfies  the  (2AT, 5)-RIP  provided 
K  =  Os(M/  polylog  At)  [6].  Therefore,  the  number  of  measurements  required  to  effi¬ 
ciently  recover  x  scales  linearly  with  the  signal  complexity  K  and  does  not  significantly 
depend  on  the  size  N  of  the  signal. 

The  purpose  of  this  thesis  is  to  further  develop  the  theory  of  compressed  sensing 
and  related  topics.  In  particular,  we  provide  a  new  uncertainty  principle  (which 
quantifies  the  sparsity  level  of  any  function  relative  to  its  Fourier  transform)  and  we 
also  demonstrate  the  applicability  of  compressed  sensing  theory  to  fast  hyperspectral 
imaging.  The  following  section  details  the  applicability  of  these  results. 

1.1  Applications  of  uncertainty  principles 

Uncertainty  principles  have  been  studied  with  increasing  frequency  for  over  the 
last  sixty  years  [7].  The  most  basic  form  of  the  uncertainty  principle  is  the  follow¬ 
ing  assertion:  A  signal  cannot  be  highly  concentrated  in  both  time  and  frequency, 
illustrated  in  Figure  2.  In  particular,  a  Fourier  transform  pair  x  and  x  cannot  be 
simultaneously  sparse  [7,  8,  9].  It  wasn’t  until  more  recently  that  uncertainty  prin¬ 
ciples  were  applied  to  sparse  signal  recovery  problems,  most  notably  by  Donoho  and 
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1.2 


1.2 


(a)  (b) 


Figure  2.  (a)  The  function  /( x)  =  exp(— 10x2).  (b)  The  Fourier  transform  of  /,  h(u>)  = 
exp(— w2/40).  Notice  that  /  is  narrow  while  h  is  wide.  The  uncertainty  principle 
contends  that  no  function  is  simultaneously  narrow  in  both  the  time  and  frequency 
domains. 


Stark  in  [8]. 

In  this  thesis,  we  develop  a  new  version  of  the  uncertainty  principle  for  discrete 
functions  and  discuss  its  applications  to  demixing  problems  and  to  the  fast  detection 
of  sparse  signals.  We  briefly  describe  these  problems  below: 


The  demixing  problem. 

Digital  signals  are  often  corrupted  by  noise  when  transmitted.  Thankfully,  it 
is  sometimes  possible  to  distinguish  the  noise  from  the  desired  signal.  Assume  the 
desired  signal  x  is  sparse  in  the  Fourier  domain.  If  x  has  been  corrupted  by  noise  e 
that  is  sparse  in  the  identity  basis  (think  speckle),  then  we  will  observe  z  =  x  +  e. 
The  goal  of  demixing  is  to  separate  x  from  e.  In  [10],  Tropp  showed  that  demixing 
can  be  accomplished  by  solving 


[I  F]v  =  Fz. 


(1) 
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The  desired  solution  v*  to  (1)  is  a  concatenation  of  Fx  and  e,  i.e., 


v 


* 


Fx 

e 


Since  v*  is  sparse,  this  problem  may  be  solved  using  ideas  from  compressed  sensing. 
In  particular,  v*  is  likely  the  1-norm  minimizcr  subject  to  (1).  However,  suppose  there 
is  a  signal  /  that  is  sparse  in  both  the  identity  and  Fourier  bases.  Then  F(x  +  /)  and 
e  —  /  are  both  sparse.  Therefore, 


F(x  +  f) 

e  -  / 


is  also  a  sparse  solution  to  (1).  As  such,  we  cannot  guarantee  recovery  of  v *  using 
compressed  sensing  methods  because  v*  is  not  the  only  sparse  solution. 

Indeed,  we  need  to  understand  how  sparse  a  function  can  be  in  both  the  iden¬ 
tity  and  Fourier  bases  in  order  to  establish  the  inherent  limitations  on  the  demixing 
problem.  This  is  one  motivation  behind  our  study  of  uncertainty  principles. 


Fast  detection  of  sparse  signals. 

Signal  detection  is  a  well-studied  problem  with  a  variety  of  applications.  The  goal 
of  signal  detection  is  to  distinguish  between  an  information-bearing  signal  and  mere 
noise,  making  it  a  critical  component  of  signals  intelligence.  Ideally,  detection  should 
be  performed  as  quickly  as  possible  so  as  to  promptly  initialize  further  processing 
and  enable  a  timely  decision.  Hence,  it  is  desirable  to  use  as  little  computation  as 
possible  to  detect  an  information-bearing  signal.  Assume  that  a  signal  x  is  Jl-sparse. 
Then  by  the  uncertainty  principle,  Fx  is  not  highly  sparse,  and  so  only  a  few  random 
samples  in  the  Fourier  domain  should  suffice  to  detect  the  signal.  In  Chapter  III,  we 
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will  show  that  only  0(K )  random  samples  are  required  to  detect  a  sparse  signal,  and 
our  detection  method  requires  only  0(K  log  TV)  time. 

1.2  Applications  to  fast  hyperspectral  imaging 

A  standard  digital  camera  captures  and  stores  images  using  only  red,  green  and 
blue  light  channels.  Combinations  of  these  frequencies  of  light  are  then  used  by 
computer  screens  to  display  photographs  for  human  consumption.  Hyperspectral 
imaging  involves  collecting  data  at  multiple  wavelengths  rather  than  just  these  three. 
The  advantage  of  collecting  multiple  spectral  bands  is  the  ability  to  glean  information 
about  the  chemical  composition  of  the  observed  object  [11].  As  a  defense  application, 
hyperspectral  imaging  is  particularly  important  for  analyzing  the  composition  of  det¬ 
onations  in  war  zones  [12].  With  this  application  in  mind,  we  seek  to  observe  brief  and 
localized  events.  The  fact  that  the  event  is  brief  is  a  disadvantage  since  conventional 
hyperspectral  imaging  platforms  might  not  have  time  to  capture  the  desired  image. 
On  the  other  hand,  the  advantage  of  localized  events  is  that  they  are  spatially  sparse, 
allowing  for  compressed  sensing  to  possibly  overcome  the  event’s  brevity. 

Assume  X  =  f(x,y,  A)  is  the  data  cube  we  would  like  to  observe,  where  x  and 
y  are  the  spatial  coordinates  and  A  is  a  spectral  coordinate,  often  referred  to  as  a 
spectral  band.  Conventional  remote  sensing  platforms  do  not  collect  hyperspectral 
data  without  some  sort  of  scanning  method.  The  three  main  scanning  methods  are 
spectral,  point,  and  line  scanning  [13].  Spectral  scanning  involves  the  platform  only 
obtaining  information  for  a  particular  spectral  band  at  a  time.  Point  scanning  (or 
whisk-broom)  methods  collect  all  light  channels  at  one  point  in  space  at  a  time.  Line 
scanning  (or  push-broom)  methods  observe  an  entire  line  in  space  at  a  time,  mov¬ 
ing  across  space.  Therefore,  this  method  collects  two  dimensional  images  with  each 
measurement,  one  spatial  dimension  and  a  spectral  dimension  [13].  The  disadvantage 
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X 

Figure  3.  Example  of  data  cube  of  hyperspectral  imagery  of  stars.  The  spatial  location 
of  a  star  is  represented  with  x-  and  y-coordinates.  The  A  axis  represents  the  spectral 
bands  of  the  data  cube.  The  data  cube  is  sparse  since  the  distribution  of  stars  in  the 
night  sky  is  sparse  and  spectral  bands  are  zero  throughout  empty  space.  Image  of  stars 
from  [14]. 


with  each  of  these  scanning  methods  is  that  they  require  a  large  amount  of  time  to 
capture  the  data  cube  A"  [11], 

Suppose  A"  is  a  hyperspectral  image  of  the  night  sky  as  shown  in  Figure  3.  We 
know  that  stars  are  sparsely  distributed  throughout  space,  and  so  we  may  assume 
that  X  is  sparse.  As  such,  compressed  sensing  methods  might  allow  us  to  bypass  the 
limitations  of  conventional  scanning.  Along  these  lines,  an  alternative  hyperspectral 
imaging  model,  proposed  by  [15],  uses  a  micro-mirror  array  (MMA)  to  capture  the 
entire  data  cube  instantaneously  (see  Figure  4).  Specifically,  the  MMA  either  reflects 
all  light  at  a  point  in  space  through  a  prism  onto  a  charge-coupled  device  (CCD)  or 
reflects  the  light  away  from  the  CCD. 

Given  compressive  measurements  of  the  data  cube,  we  need  to  apply  a  reconstruc¬ 
tion  algorithm.  To  this  end,  we  will  choose  an  algorithm  that  exploits  the  additional 
structure  of  the  desired  data  cube.  Indeed,  notice  in  Figure  3  that  the  nonzero  en- 
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Figure  4.  Example  of  a  compressive  remote  sensing  platform.  Each  exposure  uses  a 
different  configuration  of  the  micro-mirror  array  (MMA).  A  mirror  in  the  MMA  either 
reflects  all  light  towards  the  prism  or  reflects  all  light  away  from  the  prism.  After  the 
light  is  dispersed  by  the  prism,  it  is  collected  on  the  charge-coupled  device. 

tries  of  the  data  cube  are  in  batches  with  common  spatial  coordinates.  We  call  these 
batches,  blocks  and  we  say  the  data  cube  exhibits  block  sparsity.  Block  sparsity  is  more 
informative  than  regular  sparsity,  and  recovery  algorithms  that  use  this  information 
tend  to  perform  better  [16,  17].  One  such  recovery  algorithm,  developed  in  [16],  is 
block  orthogonal  matching  pursuit  (BOMP).  As  the  name  suggests,  this  algorithm  is  a 
block  version  of  a  sparsity-based  reconstruction  algorithm  called  orthogonal  matching 
pursuit  (OMP).  In  Chapter  IV,  we  apply  this  algorithm  to  simulated  data  cubes  of 
star  data  to  demonstrate  the  plausibility  of  data  cube  reconstruction  from  few  hyper- 
spectral  exposures.  In  particular,  our  simulations  suggest  that  one  can  reconstruct 
data  cubes  like  the  one  depicted  in  Figure  3  with  less  than  a  third  of  the  exposures 
required  by  conventional  scanning  methods. 


1.3  Outline 


This  introduction  serves  to  build  the  context  of  the  theory  developed  in  Chap¬ 
ter  II,  as  well  as  the  real-world  problems  that  will  benefit  from  our  results.  In  Chap¬ 
ter  II,  we  will  discuss  the  0-norm  uncertainty  principle  and  give  a  concise  proof  of  the 
Donoho-Stark  uncertainty  principle  developed  in  [8] .  Additionally,  we  introduce  and 
characterize  equality  in  a  new  mixed-norm  uncertainty  principle.  Lastly,  we  derive  a 
Fourier  transform  pair  that  achieves  near  equality  in  the  Donoho-Stark  uncertainty 
principle  as  well  as  the  mixed-norm  uncertainty  principle. 

In  Chapter  III,  we  consider  two  applications  of  the  mixed-norm  uncertainty  princi¬ 
ple.  First,  we  show  that  demixing  problems  cannot  break  the  square-root  bottleneck, 
i.e.,  in  the  worst  case  one  can  only  stably  demix  K -sparse  signals  if  K  =  0(y/rN ), 
where  N  is  the  signal  dimension  [10,  18].  Second,  we  show  how  to  detect  K -sparse 
signals  from  only  O(K)  measurements. 

In  Chapter  IV,  we  focus  on  the  hyperspectral  imaging  problem.  First  we  use 
combinatorial  designs  to  devise  a  sequence  of  micro-mirror  orientations  that  allow 
us  to  reconstruct  the  desired  hyperspectral  imagery  with  block  orthogonal  matching 
pursuit.  Second,  we  provide  simulation  results  that  demonstrate  the  feasibility  of 
compressive  hyperspectral  imaging. 

We  conclude  in  Chapter  V  with  ideas  for  future  work. 
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II.  Discrete  uncertainty  principles 


In  this  chapter,  we  discuss  three  discrete  uncertainty  principles.  We  begin  with 
a  proof  of  the  0-norm  uncertainty  principle  [19]  and  a  new  efficient  proof  of  the 
Donoho-Stark  uncertainty  principle  (Theorem  2  in  [8]).  Next,  we  discuss  a  new 
uncertainty  principle  that  we  call  the  mixed-norm  uncertainty  principle.  Later  we 
prove  a  surprising  result  that  equality  in  the  mixed-norm  uncertainty  principle  is 
achieved  precisely  when  equality  is  achieved  in  the  0-norm  uncertainty  principle. 
Lastly,  we  show  that  a  discretized  and  periodized  Gaussian  achieves  near  equality 
in  both  the  Donoho-Stark  uncertainty  principle  and  the  mixed-norm  uncertainty 
principle. 

2.1  Background 

Before  we  begin,  we  will  introduce  some  notation  and  definitions.  Let  G  be  an 
abelian  group  and  let  T  denote  the  complex  unit  circle.  Define  a  character  of  G  to 
be  a  function  y  :  G  — »  T  such  that 


x(9i  +  92 )  =  x(gi)x(92)  V01,  g2  e  G. 


(2) 


Let  G  be  the  set  of  all  homomorphisms  y  that  satisfy  (2).  The  dual  of  G,  namely 
G,  is  also  an  abelian  group  (see  Appendix  1.1).  Let  £(G)  be  the  Hilbert  space  of  all 
functions  from  G  to  C.  We  denote  the  Fourier  transform  of  a  function  /  G  £(G)  by 
/  G  £(G).  We  formally  define  the  Fourier  transform  from  £(G)  to  £(G)  below. 

Definition  1.  The  Fourier  transform  F:  1(G)  — >  £(G)  and  its  inverse  are  given  by 


(J7)M  - 


/TpXl  ^  ^ 
V  ATI  g&G 


(F~lh)[g) 


/ 1  . — y  I  ^  >  Mxlxlfi1]) 
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respectively. 


Another  useful  definition  is  the  support  of  a  function,  denoted  supp(-).  The  sup¬ 
port  of  a  function  /  on  its  domain  D  is  defined  as 

supp (/)  :=  {x  G  D  :  f(x)  ±  0}. 

The  0-norm  is  then  given  by  ||/||0  =  |supp(/)|.  We  note  that  the  0-norm  is  not 
technically  a  norm,  since  it  violates  homogeneity. 

2.2  The  0-norm  uncertainty  principle 

In  this  section,  we  introduce  the  0-norm  uncertainty  principle  (see  [19]).  Through¬ 
out  this  thesis,  we  will  assume  that  \G\  =  N . 

Theorem  1.  Given  an  abelian  group  G  of  size  N  and  f  G  £(G),  we  have 

|supp(/)||supp(/)|  >  N. 


Proof.  By  the  triangle  inequality  and  Definition  1, 


max  |  f[x]  |  < 
X6  G 


^S/l9' 


g&G 


For  any  t/eC,  dehne  sgn(y)  as 


sgn (y)  ■= 


if  y  ±  o 

if  y  =  0. 


(3) 


11 


We  know  that  |/[.g]|2  =  f[g\f[g\-  Dividing  by  \f[g]\  yields 


\f[9}\  =  Mtjtt]  =  /b]sgn/[^]. 


Substituting  (4)  into  (3)  gives 


max  f\x]  <  -L^^[^]sgn/b]  =  4^(/,sgn/)  <  -^=||/||2  ||sgn/||2 ,  (5) 

xeG  v  iV  “  VN  VN 


where  the  last  inequality  is  true  by  the  Cauchy-Schwarz  inequality.  By  definition  of 
||  •  || 2,  we  have 


2  ||sgn/||2  =  ||/||2  (  | sgn  f[g]\2  =  ||/||2  |supp(/)p  =  ||/||2  |supp(/)|*  ,  (6) 


where  the  last  equality  is  a  result  of  Plancherel’s  theroem.  Note  that  for  all  y  G  d(G), 
\\y\\2  <  |  supp(y)|1/2|||/||0O.  Therefore, 


2  <  |  supp(/)|1/2 max  |/[x]|. 

X6  G 


Combining  (5),  (6),  and  (7)  yields 


max|/[x]|  <  —7=  max  |/[x]||  supp(/)|J/2|  supp(/)|1/2. 
xeG  VN  X6 G 


Taking  the  square  and  rearranging  then  gives 


supp(/)||supp(/)|  >  N. 


We  will  study  two  robust  versions  of  Theorem  1.  First  is  the  uncertainty  principle 
introduced  by  Donoho  and  Stark  [8] .  The  second  theorem  is  in  terms  of  a  numerically 
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robust  analog  of  the  0-norm  called  numerical  sparsity  (see  Definition  3). 


2.3  The  Donoho— Stark  uncertainty  principle 


In  this  section,  we  introduce  the  Donoho-Stark  uncertainty  principle  and  provide 
a  more  efficient  proof  than  the  one  provided  in  [8].  Define  T  C  G,  W  C  G  and  let 
Pt,Pw  '■  £(G)  —■ ►  P(G)  be  time-  and  frequency-limiting  operators  dehned  by 


(Pt!)  W 


/[*]  if  t  e  T 
0  otherwise 


and 

(■ pwf )  [t]  ■=  -4=  e^/M 

wew 

respectively.  The  projection  operator  P t  removes  any  part  of  the  function  /  that  is 
supported  outside  the  index  set  T.  Similarly,  Pw  filters  /  so  that  /  is  supported  on 
the  index  set  W. 


Theorem  2  (Donoho-Stark  [8]).  Let  G  be  an  abelian  group  of  size  N  and  suppose 
f  G  £(G)  is  concentrated  in  both  time  and  frequency: 


11/  ~  Prf  || 2  <  e-rll/lb,  11/  —  Pwf  II 2  <  Dvll/lh, 


for  some  6t,  >  0.  Then 


\T\\W\>N(l-(eT  +  ew))2. 

Proof.  Assume  without  loss  of  generality  that  ||/||  =  1.  By  the  triangle  inequality, 


1  -  \\PwPTf\\  <||  /  -  PwPrf  ||  <||  /  -  Pwf  ||  +  ||  Pw(f  ~  Prf)  II-  (8) 
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Since  Pw  is  a  projector, 


11/  —  Pwf  II  +  II Pw(f  ~  Prf) II  <  11/  ~  Pwf  II  +  11/  —  -Ft/ 1 1  <  £\v  +  ^t-  (9) 

Combining  (8)  and  (9),  and  then  rearranging  produces 

1  —  (ew  +  ct)  <  II-FW-Ft/H  <  1 1 FV-Fr 1 1 2-5.2  <  H-FW-FtIIf, 

where  the  second  inequality  follows  from  the  definition  of  the  induced  norm  and  || - 1| 
denotes  the  Frobenius  norm.  We  claim  || Av’-Frj! f  —  |W||T|/iV,  which  implies  the 
result. 

To  prove  our  claim,  define  Dg  diag(ls),  where  I5  denotes  the  indicator  function 
of  the  set  S.  Note  that  Pt  =  Ft  and  Pw  =  F~1DwF ■  Therefore, 

IIfv^tIIf  =  WF^DwFDrWl  =  \\dwfdt\\2f  =  ]T|F[i,j]|2  = 

i&W 

j&T 

completing  the  proof.  □ 

2.4  Near  equality  in  the  Donoho— Stark  uncertainty  principle 

In  this  section,  we  show  that  a  discretized  and  periodized  Gaussian  function 
achieves  near  equality  in  the  Donoho-Stark  uncertainty  principle.  In  particular,  we 
have  the  following  result: 

Theorem  3.  For  every  e  >  0,  there  exists  C  >  0  such  that  for  every  sufficiently  large 
N,  there  exists  f  G  F(Zjv),  T  C  Zat,  W  C  Zat  such  that 

||/-FV||2<e||/||2,  \\f-Pwfh<e\\f\\2 
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(a) 


(b) 


Figure  5.  (a)  An  example  of  a  Gaussian  function  f(x)  =  e  N'nx  ,  with  N  =  81.  (b) 

The  Gaussian  function  from  (a)  sampled  at  every  multiple  of  1  /N  and  periodized.  This 
function  is  an  eigenvector  of  the  discrete  Fourier  transform,  and  it  has  around  V  N  large 
entries.  We  leverage  these  facts  to  demonstrate  near  equality  in  both  the  Donoho  Stark 
and  mixed-norm  uncertainty  principles. 


and 


\T\\W\  <  CN  log N. 


(10) 


Such  a  construction  will  demonstrate  that  Theorem  2  is  nearly  tight  for  all  suffi¬ 
ciently  large  N  (see  Figure  5).  Before  providing  the  construction  rigorously,  we  begin 
with  a  definition: 


Definition  2.  Let  Schwarz  space  be  the  set 


S 


C c 


sup 


xa  f^\x)  |  <  oo  Va,/3 


Theorem  4.  For  all  f  6  S,  if 


Mi]  =  £/(^+/), 

j'ez  v  7 
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then 


c FNh)[k}  =  VNYJf(k  +  k'N ), 

fc'GZ 

where  FN  denotes  the  Fourier  transform  on  7LN. 

Proof.  First,  by  the  definition  of  Fn 

ViV  ieZjv  \j'ez  v  7/ 

Apply  Corollary  1  (Appendix  1.2)  and  rearrange  to  get 

(FNh)  M  =  -j=  Yi  (j2^f)[q}e2’,iMN )  e-2"*/" 

*  7V  jeZ;v  \<?ez  / 

=  E  (e^-^y . 

Then  the  geometric  sum  formula  (see  Claim  9  in  Appendix  1.4)  implies 

E  (e^-^y  =  Vn  V  (Jk/)M 

*  gEZ  iGZjv  q=k+N7j 

Let  k!  (q  —  k)/N .  By  change  of  variables,  we  are  done: 

(. fnh )  [k}  =  ^Nj2  w)  (k + k'Ny  d 

k'ez 

Armed  with  Theorem  4  and  the  background  information  in  Appendix  1.3,  we  can 
find  the  Fourier  transform  pair  we  seek: 

Theorem  5.  If  fs(x)  :  =  ,  then  (Ff)s(£)  :=  /i(f). 

Proof.  By  the  definition  of  fs,  we  know  that  for  some  function  h, 


/,K)  =  F(fc0)=S(f7O(»O 
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where  the  last  equality  is  due  to  Claim  8.  By  definition  of  h,  we  get 


s(J7>)(sO  =  =/!/,({)■ 


□ 


Given  the  Fourier  transform  pair  fs  and  j\  /s  and  Theorem  4,  we  can  derive  the 
following  Fourier  transform  pair. 


Lemma  1.  Denote 

hs[n ]  :=  Y  fa  +  n)  . 

n'ez 

Then,  Fhs\n]  =  hj_\n\. 

L  J  Ns  1  J 

Proof.  Given  fs  and  the  definition  of  hs, 


hs[n\  :=  Y 

n'^Tj 


Theorem  4  and  Lemma  5  then  give 


Fhs[n]  :=VnY  fs(n  +  n'N)  =  Y  +  n'N )■ 

n'EZ  n'EZ 


By  definition  of  /i/s 


V7V  Y  ^  ^  Vse“w(s(n+JVn'))2 

n'ez 

=  21*Vn~sY  e~<Ns^+n'))2  =  h_i_[n]  □ 

n'EZ 


Let  be  defined  as  follows: 


/i(A)[n]  :  = 


—  7T 

e 


n+n/iV  \ 

*  y 


2 


(ii) 
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By  Lemma  1,  we  can  see  that  that  the  Fourier  transform  of  /dA)  is  simply  hfNiKh 
The  following  facts  about  will  be  useful  for  proving  near  equality. 


Lemma  2.  Define  hfK'>  as  in  (11).  Then 


ii/^iii  >  ^  -  £ 


Proof.  First,  we  expand  \w\2  =  ww  to  get 


iiftmu?  =  E 


n'GZ 


n-\-n  N 


N 

K 


EEE' 

neGn'GZn"GZ 


n+nf  N  I  f  n  +  n,/ n'1^ 
I<  I< 


Since  all  of  the  terms  in  the  sum  are  nonnegative,  we  may  infer  a  lower  bound  by 
discarding  the  terms  for  which  n"  ^  n! .  This  yields  the  following: 


N 


— 2^'  k  )  =hL^e-2vr^>_ 


n&G  rieZ 


K 


mGZ 


2  N  /  r  °° 
K 


e~2n^dx  -  1  , 


where  the  last  inequality  follows  from  an  integral  comparison.  The  result  then  follows 
from  computing  the  integral.  □ 

Lemma  3.  If  hfK]  is  defined  as  in  (11)  and  0  <n<n'<  y,  then  /bA)[n]  >  h^[n']. 


Proof.  It  suffices  to  show  that  e  n\K>  >  e 


+  e  v  K  '  ,  which  is  true  if 


(  n-|-l 


'  NJ_  2 


e  >e  d  ^  ^  +e  Dvr*/  .  De&ie  a  :=  e  ^ .  Then  it  suffices  to  show  that 


P2  >  a(n+1)2  +a(^)2. 


Dividing  both  sides  by  an  yields 


1  >  a2n+1  +  a(^)“n2  =  a2n+1  +  a(^"n)(^+n)  >  a  +  al 
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The  last  inequality  is  from  the  fact  that  2n  +  1  >  1  and  (y  —  n)  (y  +  nj  >  y. 
The  above  implies  that  e-^'  =  a  >  1,  which  is  true  because  of  our  assumption  that 
K  >4.  □ 

We  are  now  ready  to  prove  the  main  result  of  this  section. 

Proof  of  Theorem  3.  Take  /  =  /r(A)  with  K  =  y/N,  and  so  f  —  f.  It  is  then  reason¬ 
able  to  take  T  =  W  and  =  e^.  With  this,  in  order  to  show  that  Theorem  2  is  tight 
for  hfK\  it  suffices  to  find  T  such  that  (10)  is  satisfied  and  ||//A)  —  PTh^\\  <  /?/A ^  || . 

We  will  find  a  p  >  0  such  that 


f[p\  <  5  (12) 

for  some  <5.  Then  T  =  {n  E  ZW  :  |n|  <  p}.  By  Lemma  3, 

f[n\  <  f\p\  <5  Vn(fT 

and  so 

11/  -  Prf\\2  <{N-  \T\)S2  <  N52  <  252\\f\\2, 

where  the  last  inequality  follows  from  Lemma  2  and  the  assumption  that  N  >  16. 
Thus,  ||/  -  PTf\\  <  eT\\f\\  for  eT  =  y/28. 

Also,  by  our  choices  of  W,  /,  and  ew  we  have 

||/  -  Pwf\\  =  || /  -  F-'PrFfW  =  \\F~\Ff  -  PTF f) ||  =  || Ff  -  PTFf  ||. 

We  know  that  /  =  F f.  Therefore, 


11*7  -  PrFf\\  =  || /  -  Prf  ||  <  eT\\f\\  =  ew\\f\\. 
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As  such,  /  will  satisfy  the  hypotheses  of  Theorem  2.  If,  in  addition,  we  find  a  small 
p  satisfying  (12),  then  we  will  have  |T||W|  =  (2 p  —  l)2  small  as  well,  so  Theorem  2  is 
nearly  tight  for  this  choice  of  /. 

By  Lemma  3  and  symmetry,  VK^\p\  >  V K>  [n]  Vn  e  {/j, . . . ,  IV  —  p}.  Thus,  it 
suffices  to  show  a  smallest  p  possible  such  that  V^lp]  <  <5  and  p  =  O(K).  In  order 
to  ford  p,  we  will  ford  an  upper  bound  on  /?dA  ^  in  terms  of  n  and  then  set  n  =  p. 


( 


=  \  w 


E 

n'  EZ 

n+ri/ N 


n+n  N 
K 


>t 


\ 


+ 


E 

n'EZ 

I  n+re/iV 
K 


n-\-n  N 


<t 


(13) 


Assume  t  >  y/N.  We  will  bound  each  term  in  (13)  separately,  beginning  with  the 
first  term.  Let  n  =  ay/N ,  where  a  <  \VN  ■  Then, 


E 

n'EZ 

|  n-\-ri  N  |  -v>  + 

I  K  \  — 


n-\-nf  N  \  ^ 

K  ) 


7 r(a-\-n,\/rN)2 


n'EZ 

a+n'y/N  |  >t 


1 

y/N 


n'eZ 

a-\-nf  y/N\>t 


By  our  assumption  that  t  >  y/N,  the  above  is  bounded  by 


1 

y/N 


1 

Vn' 


Now,  we  will  bound  the  second  term  in  (13).  Because  a  <  |  VN,  e  is 

maximized  for  n'  =  0.  Assume,  t  =  ay/N.  Then, 


n  +  n'N 
K 


a  +  n' y/N |  <  t  —  ay/N . 
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Because  a  >  0,  the  above  implies  |1  /  y/N  +  n'/a\  <  1.  Therefore 


1  n' 

-1  <  — =  H - <  1. 

\[N  a 


Solving  the  inequality  for  n',  we  get 


-a  I  1  H - 7=^  <  n!  <  —a  (l  + 


y/N 


y/Nj  ' 


Therefore,  \n’\  <  (1  +  o(l))<x  Hence,  the  second  term  in  (13)  is  bounded  by 


#{n'  :  | a  +  n'\fN\  <  t}  ■  ei7r“2  <  3ae" 


Thus,  the  upper  bound  for  (13)  is: 


Vn 


+  3ae_7r“2  1  =  N~1/4  +  iY1/43ae"™2 


(14) 


Assuming  A^1/4  <  5/2,  we  will  find  a  lower  bound  on  a  such  that  the  above  is  less 
than  5.  Therefore,  we  need  to  find  an  a  such  that  lV1/43a:e-7rQ2  <5/2.  We  know  that 
ae-7r“2  <  e"2e-7r“2.  Therefore,  it  suffices  to  find  an  a  such  that 


(1  —  7t)q:2  <  log 


5 


6IV1/4 ;  • 


Thus,  we  need  a  such  that 


a  > 


1  /  61V1/4  y 1/2 

log(~J. 


7T  —  1 


Therefore,  for  any  fixed  5,  we  have  h(-K'l[C(5)N1^2  log1//2  N]  <  5  for  all  sufficiently 
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large  N.  Thus, 


\T\\W\  <  CNlogN 


as  desired. 


□ 


2.5  A  mixed-norm  uncertainty  principle 

In  this  section,  we  define  numerical  sparsity  and  use  it  to  formulate  a  new  mixed- 
norm  uncertainty  principle.  We  also  provide  three  different  proofs  of  this  uncertainty 
principle. 

Definition  3.  We  define  the  numerical  sparsity  of  x  e  £(G),  denoted  ns(x),  as 

II  mi  2 

/  \  x  i 

nsW  :=  Wap- 

The  concept  of  numerical  sparsity  is  introduced  in  [20]  as  a  stable  lower  bound  on 
the  sparsity  of  an  unknown  signal,  which  we  prove  below.  This  property  of  numerical 
sparsity  is  useful  in  the  context  of  our  discussion  of  compressed  sensing. 

Lemma  4.  Let  f  e  d(G).  Then  ns (/)  <  ||/||o- 

Proof.  By  the  Cauchy-Schwarz  inequality  we  know  that 

ll/lli  =  (sgn (/),/)  <  ||  sgn(/)||2||/||2  =  x/ll/lloll/lb- 


Rearranging  and  Definition  3  then  give 


ns  (/)  = 


< 


ll/ll 


o, 


thus,  proving  the  statement. 

Using  Definition  3,  we  have  the  following  uncertainty  principle. 


□ 
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Theorem  6  (Mixed-norm  uncertainty  principle).  Given  an  abelian  group  G  of  size 
N  and  f  e  i(G),  we  have 

ns (/)  ns (/)  >  iV.  (15) 


Notice  that  by  Lemma  4,  the  0-norm  uncertainty  principle  is  immediately  implied 
by  the  mixed-norm  uncertainty  principle.  We  have  two  similar  proofs  of  Theorem  6 
(both  of  which  will  be  used  later  to  characterize  equality  in  the  uncertainty  principle) 
and  a  third  very  different  proof  that  uses  interesting  techniques.  We  begin  with  the 
two  similar  proofs. 


Proof  1  of  Theorem  6.  Without  loss  of  generality,  assume  ||/||2  =  1. 
inequality, 


ns (/)  ns (/)  = 


> 


2 

oo 


2  ' 
oo 


where  the  last  equality  is  clue  to  Plancherel’s  theorem. 
Additionally,  we  know  that 


By  Holder’s 


>c  .  WFfW* 

-  <  sup  — — 

i  m  || / ||i 


11*111-00 


1 

Wr 


where  F  is  the  Fourier  transform  operator  and 
Therefore 


1 1 — s*-oo 


is  the  induced  norm. 


> 


ll/llio  ‘  Ill’ll 


=  N, 


1 — yoo 


and  so 


ns (/)  ns (/)  >  N. 


□ 


Proof  2  of  Theorem  6.  Without  loss  of  generality,  assume 


=  1.  By  Hdlder’s 


inequality, 


ns (/)  ns (/)  = 


> 


2  ; 
oo 
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where  the  last  equality  is  clue  to  Plancherel’s  theorem. 
Define  y  /,  then 


....  S  sup  II  ||  u  1— ¥OC  /-r-r 

Iblll  y*  0  |M|l  VN 


where  F  1  is  the  inverse  Fourier  transform  operator,  and  so 


!L=  Ml  >  1  =N 

l  lie-^IIL  -  lie-'IILco  ' 


Therefore 


ns (/)  ns (/)  >  N. 


A  third  proof  of  Theorem  6  uses  interesting  techniques.  We  start  by  proving  a 
few  lemmas. 

Lemma  5.  Let  G  be  a  finite  abelian  group  and  suppose  f  £  £(G)  satisfies 
||/||i  <  CVK\\f\\2.  Then  there  exists  T  C  G  of  size  K  such  that 


\\f-PTfh<C\\f\\2. 


Proof.  Let  T0  C  G  denote  the  indices  of  the  K  largest  entries  of  /  in  absolute  value, 
and  for  each  j  >  1,  let  Tj  denote  the  indices  of  the  K  largest  entries  of  /  not  covered 
by  T{  for  i  <  j.  Then 


\pTj+1fh  <  VK  max  \f[g]\  <  VKmm\f[g}\  <  — =\\PTJ\\i 

9&Tj+ 1  g£Tj  V  K 


for  every  j  >  0.  As  such, 


ii/ - Prjh  =  Y.pT,i  <  E nu-yiu <  -4 E 

7>1  2  7>1  V  >0 


\PrJ\\i<C\\f\\2, 


where  the  last  step  uses  the  hypothesis.  Taking  T  =  Tq  then  gives  the  result.  □ 
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What  follows  is  a  loose  inequality  that  we  will  use  along  with  other  techniques  to 
prove  Theorem  6. 

Lemma  6.  Let  G  be  a  finite  abelian  group.  Then  every  f  G  d(G)  satisfies 

ll/llill/lli  >c/w||/n!, 


where  C  —  |(1  —  ^). 

Proof.  Let  ||/||2  =  1  without  loss  of  generality  (the  result  clearly  holds  for  f[g\  =  0). 
First,  we  quickly  rule  out  the  case  where  |_9||/||iJ  >  N.  Here,  we  have  ||/||i  >  a /N /3 
and  \\f\h  >  ||/||2  =  H/H2  =  1,  and  so 

ll/llill/lli  >  ivW  >  Cv^ll/lli, 


as  desired. 

For  the  more  interesting  case  where  |_9||/||?J  <  N,  let  T  C  G  denote  the  indices 
of  the  |_9||/||fJ  largest  entries  of  /  in  absolute  value,  and  W  C  G  the  indices  of 
the  |_9||/||iJ  largest  entries  of  /  (this  is  possible  since  |_9||/||iJ  <  N).  Also,  take 
£t  1=  11/lli/vTf  and  -  ll/IK/yirq.  Then 

ll/l|l  =  7^=CMll/l|2  =  £TVnil/ll2. 

and  similarly,  \\f\\i  =  ew\/\W\ ||/||2-  As  such,  Lemma  5  implies  that  /  satisfies  the 
hypothesis  of  Theorem  2,  and  so 

=  \T\\W\  >  N(l-  (eT  +  ew))2. 

\  er  ew  J 
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We  claim  that  G  [1/3,  l/\/8],  which  implies  the  result: 


ii/ii;ii/ii?  >  N  =  c^N\\fr2. 

Rearranging  then  yields 

\\f\\i\\f\\l>N(eTew(l-(eT  +  ew)))2. 

To  verify  the  claim,  note  that  L^II/lliJ  <  9||/||i,  and  so 

ll/lli  ll/lli  ^  ll/lli  1 

T  vm  vww\~vwm  3' 

Also,  since  ||/||i  >  ||/||2  =  1,  we  have  L9ll/||iJ  >  9||/||?  -  1  >  8||/||f ,  which  similarly 
implies  Et  <  l/\/8-  The  same  logic  gives  Ew  G  [1/3, 1  / -\/8] .  □ 

Notice  that  Lemma  6  is  a  weak  version  of  Theorem  6,  but  paradoxically,  it  actually 
implies  Theorem  6.  This  can  be  proved  using  a  technique  called  the  tensor  power  trick 
by  Tan  [21].  First,  we  define  tensor  power. 

Definition  4.  For  gi,  ■  ■  ■  ,gk  G  G  the  tensor  power  of  a  function  f  G  £(G),  denoted 
f®k  is  de fined  as 

f0k[gi,---  ,9k]  :=  f[gi]  ■■■  f[9k\- 

where  {gu  ■  ■  ■  gk)  G  Gk . 

In  order  to  use  f®k  and  its  Fourier  transform  we  need  to  understand  Gk.  The 
tensor  power  trick  is  only  useful  if  we  can  show  that  Gk  =  Gk.  Appendix  1.1  gives 
details  that  prove  this  statement.  Since  Gk  =  Gk,  we  know  =  f®k.  With  this 
information,  we  can  prove  the  following  lemmas  about  f®k  and  f®k\ 
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Lemma  7.  Let  p  >  1  and  f®k  be  defined  as  above.  Then 

ii/®1„  =  ii/iij. 

Proof.  We  begin  with  the  definition  of  ||/0fc||^. 

ii.ni?  =  v  |/®‘[91, . . . ,gt] |»  =  e  -  ■  E  i/faii ■  ■  ■  /fcir 

(9i,---fffc)eGfe  gi&G  gk&G 

9i£G  gk 

Taking  the  pth  root  of  each  side  yields  the  result.  Therefore,  ||/®fc||p  =  \\f\\p.  □ 

Lemma  8.  Let  f®k  be  defined  as  above.  Then 

Il7®ii  =  ll/ll?- 

Proof.  By  definition  of  the  Fourier  transform,  and  Claims  3  and  5  in  Appendix  1.1, 

=  E  =  E  •••  Ei(F®/)ki]'"(^/)Wi 

X&Gk  XiSG  XfeSG 

=  Ei(F=/)ki]i"'Ei(F®/)Wi- 

XiSG  XfcSG 

where  the  last  term  is  the  definition  of  ||FG/||^.  Therefore  \\FGkf®k\\i  =  II^g/IIi-  □ 
Now,  we  are  ready  to  prove  Theorem  6  using  the  tensor  power  trick. 

Proof  3  of  Theorem  6.  Due  to  Plancherel’s  theorem  we  can  rewrite  (15)  as 

ll/llill/lli  >  vGv||/||! 
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By  Lemmas  6,  7  and  8  and  we  know  that 


iiriiii7%  >  cVN*\\f®k\\i 

implies 

||/||f||/||?>c(V7v)'i  ||/||?. 

Taking  the  kth  root  of  both  sides  yields 

ll/llill/lli  >  C^s/NWfWl 

Taking  the  limit  as  k  — »  oo  we  get  C 1//fc  — >  1.  Thus, 

n/iiiii/iii  >  '/N\\ni 


completing  the  proof.  □ 

2.6  Equality  in  the  mixed-norm  uncertainty  principle 

In  this  section,  we  will  prove  a  surprising  result:  that  equality  in  the  mixed-norm 
uncertainty  principle  is  achieved  if  and  only  if  equality  is  achieved  in  the  0-norm 
uncertainty  principle  (the  “only  if’  direction  is  particularly  surprising  considering 
Lemma  4).  In  order  to  satisfy  equality  in  Theorem  6,  the  Fourier  transform  pair  / 
and  /  must  have  a  specific  structure: 

Theorem  7.  If  f  satisfies  ns(/)ns(/)  =  N,  then  f  =  cMa Ia  where  c  E  C,  1a  is 
the  indicator  function  for  some  set  A,  and  Ma  is  the  modulation  operator  defined  as 
(. Maf)[n \  :=  f[n\e2nian/N . 


Proof.  We  can  see  from  the  first  two  proofs  of  Theorem  6  that  in  order  to  achieve 


equality  in  Theorem  6,  /  and  /  must  satisfy  equality  in  Holder’s  inequality,  specifi¬ 
cally, 

n/iiiii/iioo  =  ii/iii,  imiiimiooHi/l-  as) 

To  achieve  the  first  equality  in  (16), 

I/Ml2  =  ll/lll  =  ll/IMI/lloo  =  max  |  /  [m\  |  ^ \f[n}\  =  max \f[m\  \ \f[n\\. 

z — J  meG  z — '  z — J  m 

neG  neG  neG 

Therefore,  maxme<3  |  /  [m]  |  =  |/[n]|  for  all  n  such  that  /[n]  ^  0.  Similarly,  in  order 
for  the  second  equality  in  (16)  to  be  true,  maxmg(j  |/[m]|  =  |/[n]  for  all  n  such  that 
f[n]  7^  0.  The  additional  constraint  for  /  and  y  f  to  satisfy  equality  in  Theorem  6 
is  equality  in  the  induced  norm: 

IM/lloo  =  IMIli^ooll/lli,  IM'Mloo  =  IMMli^oolMli-  (17) 


By  definition  of  the  Fourier  transform, 


IM/lloo  =  max  |F f[m\\ 
meG 


max 

meG 


1 

Vn 


/Me 

neG 


2'K\mn/N 


l 

Vn 


/Me 

neG 


2'K\ihn/N 


where  m  is  the  index  that  achieves  this  maximum.  We  know  that  HFUi^oo  =  1/VN. 
Therefore,  we  we  have 


1 

Vn 


1 

Vn 


£i/f 

neG 


n  = 


Vn 


El/f 

neG 


n\e 


—  2Trimn/N  I 


where  the  last  equality  comes  from  the  fact  that  |e  27rimn/A  |  =  1.  Let 
v[n]  :=  f[n\e~2mmn^N .  Then  we  can  rewrite  the  first  equality  in  (17)  as 


XmM  =  MNI- 

neG  neG 
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The  statement  is  true  if  and  only  if  for  all  indices  n,  m  such  that  v[n]  0  and 
v[m]  0,  sgn(u[n])  =  sgn(u[m]).  Thus,  3c  G  C  and  z  >  0  entrywise  such  that  v  =  cz. 
Substituting  in  the  definition  of  v  and  solving  for  /  we  get 

f[n\  =  cz[n]e2”/JV 

Therefore,  to  satisfy  equality  in  Theorem  6,  /  and  /  have  the  form  /  =  cMalA ■  □ 

Theorem  7  above  can  be  used  to  prove  the  main  result  of  this  section: 

Theorem  8.  ns(/)ns(/)  =  N  if  and  only  if  ||/||o||/||o  =  N. 

Proof.  By  Lemma  4  and  Theorem  1,  we  know  that 

ll/lloll/llo  >  ns(/)ns(/)  >  N. 

Therefore,  functions  which  satisfy  equality  for  Theorem  6  form  a  subset  of  the  func¬ 
tions  that  achieve  equality  for  Theorem  1. 

Assume  that  /  satisfies  equality  in  Theorem  6.  Then  using  Theorem  7,  the  nu¬ 
merical  sparsity  of  /  can  be  calculated  as  follows: 

JI/JH  (EL,  I/Ml)2  (EL,  |CMn,M|)2  m.c, 

'[J>  ll/ll!  Il/llooll/lli  C-Zn=i\CM«lA[n}\  ll/llo-C2  ll/l10' 

Thus,  the  set  of  functions  that  achieve  equality  in  Theorem  6  is  the  same  set  that 
achieves  equality  in  Theorem  1.  □ 

Consider  the  case  when  K  divides  N  and  /  is  a  Dirac  comb  supported  on  multiples 
of  K .  ft  is  well  known  that  /  is  a  Dirac  comb  supported  on  multiples  of  N/K ,  and 
so  ll/lloll/llo  =  N.  Thus,  by  Theorem  8,  /  achieves  equality  in  Theorem  6  in  this 
special  case.  As  such,  it  is  possible  to  nontrivially  achieve  equality  in  Theorem  6  if 
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and  only  if  N  is  composite  (the  “only  if’  direction  follows  from  the  fact  that  ||/||o 
and  ll/Ho  are  both  integers).  In  the  next  section,  we  give  an  example  of  a  function 
that  achieves  near  equality  in  Theorem  6  regardless  of  N. 

2.7  Near  equality  in  the  mixed-norm  uncertainty  principle 

In  this  section,  we  will  show  that  a  discretized  and  periodized  Gaussian  function 
achieves  near  equality  in  the  mixed-norm  uncertainty  principle  in  the  following  sense: 

Theorem  9.  There  exists  a  C  >  0  independent  of  N  and  f  €  i(/N)  such  that 


ns (/)  ns (/)  <  CN. 


Once  again,  let  /  =  as  defined  in  (11)  with  K  =  y/N .  Since  /  =  /,  it 
suffices  to  show  that  ns (/)  <  / CN.  A  proof  of  this  statement  requires  the  use  of  the 
following  lemma: 

Lemma  9.  Given  h(K'1  where  K  =  y/N  ,  ||h^||i  <  |N3/4. 

Proof.  By  definition, 


N 


N 


\h{K)[n]  \  = 


n=  1 


n=  1 


E 

n'£Z 


exp 


-7T 


n 


y/N 


n 


y/N 


Because  exp  (y)  >  0  for  all  t/6R  the  above  we  then  have 


N 


\\h{K)\\i  =  ^N1*  J^exp 


n=  1  n'eZ 


(18) 


The  following  is  an  upper  bound  on  (18)  due  to  integral  comparison. 


N 

n=l 


exp 


-7T 


n 


y/N 


x 


y/N 


N 

dx  +  N2  exp 

n=  1 


-7T 


n 


(19) 
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First  we  will  find  an  upper  bound  on  the  left  hand  side  of  (19).  Substituting 
u  =  y rTr(n/\/N  +  xy/N)  into  the  first  term  of  (19)  gives 


N  -t  /»00  N  r— 

N*Y  ,  /  exp[— u2)du  +  N5/4  =  IV3  =  =  N^. 

n=  1  VTTiV  J  oo  n=l 


The  upper  bound  on  the  second  term  of  (19)  is  more  difficult  to  compute.  By  sym¬ 
metry, 


N 

N  3  exp 

71=1 


-1 

IV 3  exp 

n=-N 


(  n  V 

.  r1 

x2 

<  /  exp 
J-N 

~71n_ 

dx. 


We  know  that 


dx  < 


dx 


exp  [—u2]du 


Vn 

~2~’ 


where  u  =  \Jn/Nx.  Therefore,  the  upper  bound  on  the  second  term  of  (19)  is  IV3/4/ 2. 
Combining  this  upper  bound  with  the  one  for  the  first  term  yields  the  result.  □ 

Proof  of  Theorem  9.  It  is  straightforward  to  construct  our  upper  bound  on  ns (hlyK'>) 
using  a  combination  of  Lemma  9  and  Claim  2  (see  Appendix  1.1): 


ns (/jW)  < 


9  N 3/2 


N 

VN-V2 


Thus,  for  some  C  >  0  we  have  ns(/)  ns(/)  <  CN. 


9y/2 

~T~ 


(1  +  ojv(1))  VN. 


(20) 

□ 
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III.  Applications  of  the  uncertainty  principle 


In  Chapter  II,  we  introduced  a  mixed-norm  uncertainty  principle  and  demon¬ 
strated  that  near  equality  is  achieved  with  a  discretized  and  periodized  Gaussian.  In 
this  chapter,  we  will  discuss  two  applications  of  these  results.  First,  we  will  show  a  fun¬ 
damental  limitation  of  the  demixing  problem,  namely,  that  demixing  TV-dimensional 
K -sparse  signals  is  only  stably  possible  in  the  worst  case  if  K  =  0(\/rN).  Second,  we 
will  show  how  to  detect  K -sparse  signals  with  only  0(K )  measurements. 

3.1  Limitations  of  the  demixing  problem 

The  main  idea  of  demixing  is  that  if  a  signal  x  which  is  sparse  in  the  Fourier 
domain  is  corrupted  with  noise  e  that  is  sparse  in  the  time  domain,  then  we  can  use 
compressed  sensing  methods  to  reconstruct  the  original  signal  x  given  the  corrupted 
signal  z  =  x  +  e.  This  reconstruction  is  done  by  solving 

v*  =  argmin  ||v||i  subject  to  [I  F]v  =  Fz , 

where  the  solution  v*  is  a  vertical  concatenation  of  Fx  and  e. 

Coherence-based  guarantees  in  [22]  show  that  it  suffices  for  v *  to  be  Jl-sparse 
with  K  =  O(VN)  while  RIP-based  guarantees  in  [6]  allow  for  K  =  0(N )  if  [I  F] 
is  replaced  with  a  random  matrix.  We  refer  to  this  disparity  as  the  square-root 
bottleneck.  In  particular,  does  [I  F]  perform  similarly  to  a  random  matrix  or  is  the 
coherence-based  sufficient  condition  on  K  also  necessary?  Despite  being  studied  in 
both  [10]  and  [18],  this  fundamental  problem  has  gone  unsolved  for  general  N  until 
now.  In  this  section,  we  use  numerical  sparsity  to  show  that  <h  =  [I  F]  cannot  break 
the  square-root  bottleneck.  We  begin  with  a  definition: 
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Definition  5.  Let  $  e  CMxN .  We  say  that  $  satisfies  the  C-width  property  if 


C 

|a;||2  <  — ^=||a;||i  \/x  E  Ker(<f>). 

v  K 


The  following  theorem  is  provided  in  [23]  and  [24]: 


Theorem  10.  Define 


A (y)  :=  argmin  ||a;||i  such  that  4>a;  =  y. 


Then  3 C  >  0  such  that 


C 

||A(4>a;)  —  a;|| 2  <  — /=||x  —  x.k-||i  Vr  G  Kw 
v  K 


if  and  only  if  3c  >  0  such  that 


|x||2  <  — ^=||x||i  Vr  6  Ker($). 

v  K 


Furthermore,  C  x  c  in  both  directions  of  the  equivalence. 


Based  on  Definition  3,  the  C-width  property  is  equivalent  to  having 


ns(x)  >  K/C 2  Mx  G  Ker (<£>). 


Let  <f>  =  [I  F]  and  be  defined  as  in  (11).  Take  x  e  C2N  such  that 


x  = 


h™ 

-hW 


(21) 


for  K  =  y/N.  We  know  that  x  e  Ker<f>.  We  will  show  that  ns(x)  <  C\[M.  This  is 
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significant  because  by  Theorem  10,  it  provides  a  converse  to  the  square-root  bottleneck 
for  this  well-studied  matrix.  The  proof  of  this  statement  requires  the  use  of  the 
following  lemma: 

Lemma  10.  For  x  as  defined  in  (21),  ns(x)  =  2ns(/?/A^). 

Proof.  We  will  first  show  that  ||:r||2  =  4||/dA)||2.  The  definition  of  x  and  /dA^  gives 

2N 

INK  =  22  \xi\ 

i=  1 

N  2  N 

=  22  M  +  22  i^i 

i— 1  i=7V+l 

N  N  N 

=  E  iftWwi + £  i  -  '’"Th  = 2  E  =  2iiftwii.' 

1=1  1=1  1=1 

Squaring  the  above  yields  ||a;||f  =  4||h^A^||^.  Next,  we  will  show  that  ||x|||  =  2||/i^|||. 
Similar  to  the  above  by  definition, 


2  N 

\x\\l  =  Y'  \x.  12 


E 

n=l 

N 

22^2  +  22 

n=  1  n=N+ 1 

N  N 

E  lftWWI2  +  E I  -  ftmNI2  = 2 E  lfcwNI2  =  2||ft(K>|| 


27V 


n=  1 


n=l 


n=  1 


Combining  the  above  results  gives  us 


IMI2 

/  \  x  l 

nsW  =  WIN 
||x  1 1 2 


4||^)||l2 

2||hW||22 


2  ns  (h{K)), 


completing  the  proof 


□ 
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Therefore,  based  on  Lemma  10  and  (20)  we  have 


nsW  -  =  71 (1  +  0n{1)) 

as  desired. 

3.2  Fast  detection  of  sparse  signals 

In  this  section,  we  will  show  that  with  high  probability,  an  arbitrary  A'-sparse 
signal  can  be  detected  using  0(K )  measurements.  We  want  to  be  able  to  distinguish 
between  a  sparse  signal  and  the  zero  signal.  Therefore,  we  are  testing  the  following 
hypotheses: 


H0  :  x  =  0 

Hi  :  ||x||!  =  \\x\\0  <  K. 

Here,  we  assume  we  know  the  2-norm  of  the  sparse  vector  we  intend  to  detect,  and 
we  set  it  to  be  \J N/K  without  loss  of  generality  (this  choice  of  scaling  will  help  us 
interpret  our  results  later).  We  will  assume  the  data  is  accessed  according  to  the 
following  query-response  model: 

Definition  6  (Query-response  model).  If  the  ith  query  is  G  7LN,  then  the  ith 
response  is  (Fx)[ni\  +  €i,  where  the  e^s  are  iid  complex  random  variables  with  some 
distribution  such  that 

E|ej|  =  a,  E\ei\- = /32. 

The  coefficient  of  variation  v  of  |e8|  is  defined  as 

2  Var|ej|  /32  —  a2 
V  =  (EM)2  =  a2  ‘ 
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Note  that  for  any  scalar  0,  the  mean  and  variance  of  |  ce*  |  are  \c\ol  and  |c|2  Var  |e;|, 
respectively.  Hence,  v  is  scale  invariant  and  is  simply  a  quantification  of  the  “shape” 
of  the  distribution  of  e*|.  We  will  evaluate  the  responses  to  our  queries  with  an 
Prdetector,  defined  below. 

Definition  7  (t) -detector).  Fix  a  threshold  t.  Given  responses  from  the 

query-response  model,  if 

M 

5^ls/i|  >  F 

i= 1 

then  we  reject  H0 . 

The  following  is  the  main  result  of  this  section: 

Theorem  11.  Suppose  a  <  1/(8 K).  Randomly  draw  M  independent  indices  uni¬ 
formly  from  ZN,  input  them  into  the  query-response  model  and  apply  the  l\- detector 
with  threshold  r  =  2 Ma  to  the  responses.  Then 


Pr  (  reject  If  , 


Ho)  <P 


and 


Pr  (  fail  to  reject  H0 


<p  +  q 


provided  M  >  4 K/q  +  v2/p. 


In  words,  the  probability  that  the  fd-detector  delivers  a  false  positive  is  p  and 
the  probability  that  it  delivers  a  false  negative  is  p  +  q.  These  error  probabilities 
can  be  estimated  better  given  more  information  about  the  distribution  of  the  random 
noise,  and  the  threshold  r  can  be  modified  to  decrease  one  error  probability  at  the 
price  of  increasing  the  other.  Notice  that  we  only  use  0(K )  samples  in  the  Fourier 
domain  to  detect  a  K -sparse  signal.  Since  the  sampled  indices  are  random,  it  will  take 
O(logiV)  bits  to  communicate  each  query,  leading  to  a  total  computational  burden 
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of  0(K log  N)  operations.  We  suspect  K -sparse  signals  cannot  be  detected  with 
substantially  fewer  samples  (in  the  Fourier  domain  or  any  domain). 

We  also  note  that  the  acceptable  noise  magnitude  a  =  0(1/ K )  is  optimal  in  some 
sense.  To  see  this,  consider  the  case  where  K  divides  N  and  a:  is  a  Dirac  comb  of  K 
deltas.  Then  Fx  is  a  Dirac  comb  of  N/K  deltas.  (Thanks  to  our  choice  of  scaling, 
each  delta  in  the  Fourier  domain  has  unit  magnitude.)  Since  a  proportion  of  1/K 
entries  is  nonzero  in  the  Fourier  domain,  we  can  expect  to  require  0(K )  random 
samples  in  order  to  observe  a  nonzero  entry,  and  the  £i-detector  will  not  distinguish 
the  entry  from  accumulated  noise  unless  a  =  0(1/ K).  We  will  use  the  following 
lemmas  to  prove  Theorem  11. 


Lemma  11.  Suppose  ||a;||o  <  K  and  ||a;||| 
Y  :  =  \(Fx)[n)\.  Then 


ED  > 


1 

K' 


=  N/K.  Draw  n  ~  Unif(Zjv)  and  define 


E  Y2  = 


1 

K' 


Proof.  By  Lemma  4  we  know  ns(x)  <  ||a;||o  <  K,  and  Theorem  6  gives 


N  <  ns(x)  ns  (Fx)  <  K  ns  (Fx). 


Rearranging  and  the  definition  of  numerical  sparsity  then  gives 


N 

K 


<  ns  (Fx) 


\\Fx\\i 

\\Fx\\l 


N/K 


where  the  second  to  last  equality  is  due  to  Plancherel’s  Theorem.  Thus  ||ir'a;||i  > 
N/K.  Then 


EF  =  h  £  y” =  h'Z l(Fx)HI  =  ^l|Fa:|11  -  h 
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and 


Ei"2  =  ^ E  Y'  =  fn  k^)hi2  =  ^ii^m  =  f 

nGZjv  nEZn 

Lemma  12.  Take  e1 ,...  ,€m  to  be  iid  complex  random  variables  with  E|es 
E|e.j|2  =  f32.  Then 

provided  M  >  v2  /p,  where  v  is  the  coefficient  of  variation  of  \et 
Proof.  Chebyshev’s  inequality  gives 

Var  X 

Pr(|A-EA|  >t)  < 


M 


^  |ej|  <  Ma  >  1  —  p 


i= 1 


Take  X  =  Yl?iL1  |  e*| .  Then 

M 

EA  =  ^^E|ei|  =  ME|e*|  =  Ma 

i= 1 

and 

Var  A  =  M  Var  |e*|  =  M(E|ei|2  -  (E|e;|)2)  =  M(/32  -  a2). 


Thus, 


Pr 


M 


>  Ma  +  t  =  Pr  E  |ej|  —  Ma  >  t 


.  i— 1 


<  Pr(|A  -  EA|  >t)< 


Var  A  M(/32  —  a2) 


t 2 


t2 


Take  t  =  Ma.  Then 


Pr 


>  2  Ma 


<  M 


(l2  -  «2) 

(Ma)2 


f32  —  a2  (32  —  a2  p 
Ma2  ~  a2  v2 


□ 

a  and 
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which  gives  the  result. 


□ 


Lemma  13.  If  M  >  AK/q  +  v2/p,  and  a  <  1/(8 K),  then 


M 


Pr  i^i  < 


r 


,  2=1 


„  „  ll2  N  . 

kilo  <  K,  \\x\\2  =  —  )  =p  +  q 


Proof.  We  know  that 


M 


M  M 


Pr  E  \Ui\  <  2 Ma  )  <  Pr  E  Yi  —  ^  |e*|  <  2  Ma 


,  i=  1 


,2=1  2=1 


For  simplicity  let  b  =  Yi  —  le*l-  By  Lemma  12,  we  then  have 


M 


(22) 


Pr(6  <  2 Ma)  =Pr  b  <  2 Ma 


M 


M 


El£-I>2  Ma  Pr  E  |  ti  |  >  2 Ma 


i= 1 


,  *= i 


+  Pr  \b  <  2  M a 


M 


M 


J2h\<2Ma  Pr  E  |ej|  <  2Ma 


i=  1 


.i=l 


M 


<p  +  Pr  E  Yi  <  4 Ma  . 


(23) 


,  2=1 


Chebyshev’s  inequality  gives 


Pr(|A'  -  EX\  >t)<  YY. 


Taking  X  =  Y^UYn  Lemma  11  gives  EX  =  MKYt  >  §  and  VarX  =  MVaxY,. 
Thus, 

Pr  (E  Y‘  <  f  -  ()  S  Pr<A'  S  EA'  Pr(|A  -  EA|  >  i)  <  YY1  <  fL. 
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Taking  t  =  M /  (2 K )  gives 


M 


M 


Pr  E  Yi  <  AM  a  I  <  Pr  £*< 


.  i= 1 


,  i= 1 
M 


M 

2K 


M 


.  i—  1 


< 


M 


_  M  /2 Ky  _  AK 
Kp-Ji  (  m" )  ~J 7  <q' 


(24) 


Combining  (22),  (23),  and  (24)  gives  the  result. 

Thus,  Lemma  13  and  12  together  prove  Theorem  11. 


□ 
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IV.  Fast  hyperspectral  imaging  using  compressed  sensing 


Astronomers  have  been  attempting  to  analyze  the  spectra  of  stars  for  over  a 
hundred  years  [25].  With  modern  technology,  scientists  can  look  at  hyperspectral 
images  of  stars  and  use  the  data  to  determine  their  chemical  makeup.  This  concept 
is  known  as  spectroscopy.  The  main  phenomena  of  interest  are  absorption  lines  in  the 
spectral  bands.  Absorption  lines  are  a  significant  drop  in  the  intensity  of  the  light  at 
a  given  frequency,  or  spectral  band,  compared  to  the  distribution  of  the  surrounding 
spectral  bands  (see  Figure  6).  The  locations  of  these  absorption  lines  can  be  matched 
with  the  light  emitted  from  super  heating  particular  elements  of  the  periodic  table, 
which  then  allows  scientists  to  determine  the  chemical  composition  of  a  star. 

Conventional  hyperspectral  cameras  are  slow.  Different  methods  of  hyperspectral 
imaging  either  require  time  to  process  the  entire  space  desired  or  greatly  limit  the 
number  of  spectral  bands  [27].  Before  describing  these  methods,  it  is  helpful  to  intro¬ 
duce  some  notation.  Let  X  =  f(x,y,  A)  be  a  function  representing  the  hyperspectral 
image  we  wish  to  measure.  This  data  cube  is  made  up  of  two  spatial  dimensions  with 
coordinates  represented  by  x  and  y  and  one  spectral  dimension  whose  coordinate  is 
represented  by  A.  In  particular,  for  any  fixed  spatial  coordinates  (x,y),  varying  A 
gives  the  spectrum  of  light  received  at  (x,y). 

There  are  two  conventional  methods  for  hyperspectral  imaging:  spectral  scanning 
and  spatial  scanning.  Spectral  scanning  involves  the  use  of  filters,  such  as  bandpass 
filters,  to  collect  a  two-dimensional  slice  of  X  that  contains  only  one  fixed  spectral 
band  A  and  all  spatial  information.  Filters  are  commonly  installed  on  wheels  that 
rotate  in  front  of  the  lens,  allowing  only  one  spectral  band  to  be  captured  at  a 
time  [13,  27].  Thus,  time  is  needed  to  collect  all  spectral  bands  in  X. 

There  are  two  types  of  spatial  scanning  methods.  Point-scanning  (or  whisk-broom) 
methods  rely  on  moving  mechanical  parts  in  the  camera.  This  method  uses  a  mirror 
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RA=236.39 1 39,  DEC=  0.40853,  MJD=51691,  Plate=  342,  Fiber=435 


Figure  6.  Spectrum  of  a  star  (object  identification  number  587722953303982115)  from 
the  Sloan  Digital  Sky  Survey  [26].  Here,  we  see  the  spectrum  of  this  star  over  a 
wavelength  range  of  400-900  nm.  As  such,  this  data  includes  both  visible  and  infrared 
light.  Though  the  data  is  noisy,  there  are  at  least  seven  distinct  absorption  lines 
throughout  the  spectra. 


to  scan  across  all  x  coordinates  of  X  for  each  y  coordinate.  The  mirror  then  reflects 
the  light  from  a  fixed  point  in  space  through  a  pinhole  and  then  a  prism  that  disperses 
the  light  before  it  is  recorded  by  charge-coupled  device  (CCD).  Therefore,  spectral 
information  of  only  one  point  in  space  is  captured  by  the  camera  at  a  time.  The  need 
to  scan  in  both  spatial  directions  is  time  consuming  and  requires  complex  hardware 
that  is  prone  to  degrade  over  time.  On  the  other  hand,  whisk-broom  scanning  has 
the  advantage  of  only  requiring  one  detector  to  calibrate  [13,  27]. 

The  line-scanning  (or  push-broom)  method  involves  passing  light  through  a  ver¬ 
tical  slit  so  that  all  spectral  bands  of  the  desired  image  are  collected  for  some  fixed 
spatial  coordinate  x.  Therefore,  a  two-dimensional  slice  of  the  data  cube  is  collected 
(with  one  spatial  and  one  spectral  dimension).  The  light  is  passed  through  a  prism 
that  disperses  the  different  frequencies  before  it  is  collected  by  a  CCD  [13,  27].  The 
major  limitations  of  the  spatial  scanning  methods  are  (1)  the  amount  of  time  it  takes 
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coded  aperture  disDerser 


Figure  7.  Schematic  for  the  coded  aperture  snapshot  spectral  imager  [27].  The  coded 
aperture  mask  in  question  is  a  micro-mirror  array  and  a  prism  is  used  as  the  disperser. 
In  this  chapter,  we  consider  a  similar  sensing  platform,  which  we  model  in  Section  4.1 

for  the  long  exposure  required  to  image  each  portion  of  the  data  cube  and  (2)  the 
necessity  to  move  the  camera  in  order  to  capture  each  portion. 

Luckily,  the  distribution  of  stars  throughout  space  is  highly  sparse.  Therefore, 
instead  of  using  time-consuming  scanning  methods  to  collect  hyperspectral  data,  we 
can  apply  compressed  sensing  methods.  The  mechanism  we  will  use  is  similar  to 
other  spectroscopy  mechanisms  in  that  it  involves  the  use  of  a  prism  or  some  other 
dispersion  device  before  the  data  is  collected  on  a  CCD. 

Unlike  other  spectroscopy  platforms,  this  platform  makes  use  of  a  micro-mirror 
array  (MMA)  [15],  that  is,  a  rectangular  grid  of  mirrors,  each  corresponding  to  a 
spatial  location  (x,y).  Each  mirror  is  oriented  either  to  reflect  all  available  light  at 
(x,  y)  through  the  prism  onto  the  CCD  or  to  reflect  that  light  away  from  the  prism 
and  CCD  so  that  it  is  not  measured  (see  Figures  4  and  7).  Therefore,  instead  of 
measuring  the  data  cube  one  slice  at  a  time,  we  measure  various  combinations  of  data 
cube  entries.  We  intend  to  exploit  the  data  cube’s  sparsity  in  order  to  reconstruct  it 
from  these  measurements. 

Other  methods  of  compressive  hyperspectral  imaging  have  been  developed  re¬ 
cently.  One  method  is  called  chromotomography ,  which  is  a  generalized  scanning 
method  that  does  not  require  spatial  filtering  such  as  the  slit  used  in  line  scanning 
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methods.  Rather,  chromotomography  involves  a  rotating  prism  [11,  27].  A  disadvan¬ 
tage  of  this  method  is  that  the  dispersion  elements  required  are  difficult  to  produce 
[27].  Another  method  is  called  coded  aperture  imaging,  which  is  a  snapshot  method 
that  uses  coded  apertures  (e.g.,  MMA)  in  place  of  other  spatial  filtering  devices  used 
for  scanning  methods  [28,  29].  The  model  we  propose  is  similar  to  the  coded  aperture 
systems  developed  in  [28,  29]. 

4.1  Modeling  the  sensor 

Let  X  =  f(x,y,X )  be  a  data  cube  like  that  in  Figure  3  with  spatial  dimensions 
of  sizes  J  and  L  and  with  N  spectral  bands,  that  is,  X  e  MJxLxiV.  The  entire 
measurement  platform  (i.e.,  the  MMA,  prism,  and  CCD)  can  be  represented  as  a 
linear  operator  Av,  where  v  represents  the  orientation  of  the  MMA’s  mirrors.  We 
will  construct  Av  in  terms  of  two  operators  and  T  such  that  Av  =  where 

is  a  model  of  the  MMA  and  T  is  a  model  of  the  prism  and  CCD  combined.  As 
mentioned  in  the  previous  section,  the  MMA  either  reflects  all  spectral  bands  for  a 
given  point  (x,  y )  towards  the  prism  and  CCD  or  reflects  all  spectral  bands  away  from 
the  CCD.  As  such,  the  MMA  transforms  the  data  cube  according  to  a  linear  operator 

:  MJxLxiV  ^  RJxLxN  defined  by 


*&v&(x,y, A)  •  ^(x,y)^(x,y,\)  (25) 

where  V(x^  =  1  if  the  mirror  at  (x,  y)  allows  light  through  and  is  otherwise  zero.  Next, 
the  prism  disperses  the  different  frequencies  of  light  by  different  amounts  before  the 
CCD  records  light  intensities.  We  model  this  by  normalizing  x,  y  and  A  so  that 
x  G  {1, . . . ,  J},  y  G  {1, . , . ,  L}  and  A  G  {1, . . . ,  N},  and  then  modeling  the  dispersion 
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process  with  a  linear  operator  T  :  ^JxLxN  _ >  ^Jx(L+N  ')  defined  by 

•  3(x,y+\—  1)  •  (26) 

That  is,  light  at  (x,  y )  of  wavelength  A  contributes  to  the  (x,  y  +  A  +  l)th  entry  of 
the  CCD.  Indeed,  in  our  model,  the  prism  is  oriented  so  that  dispersion  acts  in  the 
y  direction.  At  this  point,  we  note  that  (\l/<3>„Al)[:r,  •]  is  completely  determined  by 
X[x,  *,  •],  meaning  we  may  simplify  our  analysis  by  considering  these  slices  of  A"  in 
parallel.  As  such,  for  the  remainder  of  this  thesis,  we  will  assume  J  =  1  without  loss 
of  generality. 

Let’s  briefly  describe  the  matrix  representation  of  Av.  We  choose  the  identity  basis 
for  both  ulxix7V  and  Klx(L+Ar~1).  Specifically,  we  order  the  basis  elements  of  MlxLxiV 
as  i,i),  5(i,  1,2),  •  •  • ,  5(i,i,jv),  5(i, 2,1),  •  •  • ,  5(i, i, at).  Then  z  is  a  vector  of  coefficients  of 
A"  in  this  basis.  By  (25),  T,,  is  a  multiplication  operator  that  only  depends  on  y. 
As  such,  its  matrix  representation  in  this  basis  is  a  diagonal  matrix  composed  of  L 
diagonal  blocks,  where  the  yth  block  is  v^^Inxn-  By  (26),  T  is  a  translation  and  sum 
operator  that  depends  on  y  and  A.  The  matrix  representation  uses  the  translation 
operator  T,  where 

Ty-%  :=  5,+a_i 

for  identity  basis  elements  5A  G  £(7jL+n_i).  We  apply  this  cyclic  translation  with  the 
help  of  a  zero-padding  matrix: 

Inxn 

0l-1xN 

We  then  write 

T  :=  [B  TXB  Tl~1B\.  (27) 
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Overall,  we  have  the  measurement  matrix  Av  =  Tcfy,  when  the  MMA  mirrors  are 
oriented  according  to  v.  Given  multiple  exposures  with  varying  MMA  orientations 
{vi\i=ii  our  observations  can  be  organized  as  y  —  Az,  where 


A  = 


A. 


Vi 


A. 


VQ 


(28) 


In  total,  A  has  M  :=  Q(N  +  L  —  1)  rows  and  P  :=  LN  columns. 


4.2  Modeling  the  data  cube 

Recall  from  the  previous  section  that  the  data  cube  is  1  x  L  x  N  without  loss  of 
generality,  and  we  reshape  this  cube  to  form  the  vector  z  G  MiA  =  Mp.  In  particular, 
the  first  length- A"  block  of  z  corresponds  to  the  spectrum  with  spatial  coordinates 
(1, 1)  and  the  second  length- A  block  corresponds  to  (1,2),  etc.  Since  the  scene  will 
only  have  a  few  stars,  we  have  that  z  is  sparse,  with  the  yth  block  giving  the  spectrum 
of  the  star  at  (1,  y)  unless  no  such  star  exists,  in  which  case  the  yth  block  is  identically 
zero.  Therefore,  the  nonzero  entries  of  z  are  clustered  into  blocks,  and  we  say  that  z 
is  block  sparse,  as  defined  below. 

Definition  8  (Block  sparsity).  Let  z  be  a  vector  of  length  P  such  that  z  is  a  concate¬ 
nation  of  blocks  of  length  N.  We  say  that  z  is  K -block  sparse  if  at  most  K  blocks  in 
z  are  nonzero.  We  denote  the  nth  block  of  z  by  zn. 

When  a  star  is  present  in  the  data,  we  model  the  spectral  radiance  of  the  star 
as  black-body  radiation.  As  such,  we  use  Planck’s  Law  as  part  of  our  model  of  the 
nonzero  blocks  of  z  [30]. 
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Figure  8.  Example  of  star  simulated  in  MATLAB  with  temperature  of  6380.9  K  and 
ten  absorption  lines  using  stardata.m  (See  Appendix  B).  This  curve  closely  resembles 
the  real-world  data  in  Figure  6. 

Definition  9  (Planck’s  Law).  The  spectral  radiance  of  a  black  body  is  a  function  of 
wavelength  and  temperature: 


where  c,  h  and  k  are  the  speed  of  light,  the  Planck  constant,  and  the  Boltzmann  con¬ 
stant,  respectively.  Additionally,  X  is  wavelength  in  meters  and  T  is  temperature  in 
Kelvin. 

As  mentioned  in  the  beginning  of  this  chapter,  absorption  lines  in  the  spectral 
radiance  of  stars  indicate  the  elements  that  are  present  in  said  star.  We  can  identify 
what  elements  are  present  because  these  absorption  lines  are  the  negative  of  the 
emission  lines  generated  when  those  elements  are  heated  to  extreme  temperatures  [31]. 
In  the  absence  of  interference,  these  lines  exhibit  Lorentzian  shapes  [32,  33].  For  each 
star,  we  model  s  absorption  lines  in  the  spectral  data  as  negative  Lorentzian  functions. 
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In  particular,  an  absorption  line  at  wavelength  A*  is  given  by 


HiW''  1  +  (A  -  A*)2//?’ 

where  a  and  /3  are  shape  parameters.  Data  collected  for  a  star  with  absorption  lines 
at  Ai, . . . ,  Xs  is  modeled  as  L( A,  T)  +  J2l=i  Figure  8  illustrates  an  example  of 

this  simulated  data. 


4.3  Block  orthogonal  matching  pursuit 

In  this  section,  we  define  the  block  orthogonal  matching  pursuit  (BOMP)  al¬ 
gorithm  and  conditions  for  guaranteed  reconstruction  of  the  desired  data.  Take 
A  G  MMxP  and  let  z  G  Mp  be  a  K- -block  sparse  vector  with  blocks  of  size  N  =  P/L. 
BOMP  is  an  iterative  greedy  algorithm  that  recovers  z  from  w  =  Az.  Let  be  the 
ith  M  x  N  block  of  A  and  z3  be  the  jth  block  of  z.  Initialize  X°  to  be  the  empty  set, 
the  residual  r°  =  w ,  and  z°  as  the  zero  vector.  On  the  Ith  iteration  of  BOMP,  an 
additional  element  of  the  support  X  is  found: 


X  =  I1  1  U  argmax  ||Al*'r  ||2 


(30) 


Then  given  X1,  we  solve  for  zl  using  least-squares  minimization: 


zl  =  argmin 


W  -  ^  Ai*i 
iei1 


(31) 


Lastly,  we  update  the  residual: 


rl =  w  —  ^  Aiz\. 

i&l1 


(32) 
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In  the  case  of  traditional  sparsity,  if  a  matrix  A  has  small  coherence,  defined  below, 
it  has  been  shown  that  A  can  be  used  to  successfully  recover  any  sparse  signal. 

Definition  10  (Coherence).  For  a  matrix  A,  define  the  coherence  of  A,  p(A)  as 

p(A)  :=  max  |a*a,-| 
hj¥=i 

Specifically,  it  is  known  that  orthogonal  matching  pursuit  will  recover  any  K- 
sparse  z  from  w  =  Az  provided  /i  >  (2 K  —  l)-1  [9].  Eldar  et  al.  [16]  developed  two 
analogous  coherence  definitions  that  lead  to  a  similar  guarantee  for  block  orthogonal 
matching  pursuit.  They  define  block-coherence  as  follows: 

Definition  11  (Block-coherence).  For  A  =  [Ai---Ai\  with  M  x  N  blocks  Ak,  we 
define  the  block-coherence  of  A  to  be 

1 

Hb(A)  :=  ^2 

iv 

Notice  that  Definition  11  does  not  account  for  pairs  columns  within  a  common 
block  of  A.  These  columns  are  considered  by  sub-coherence: 

Definition  12  (Sub-coherence).  For  A  =  [A\  ■  ■  ■  Afi\  with  MxN  blocks  Ak,  we  define 
the  sub-coherence  of  A  to  be 


where  and  are  columns  of  Ak. 

Definitions  11  and  12  lead  to  a  block  version  of  the  coherence  condition  [16]: 

Theorem  12.  Let  z  G  Cp  be  a  K -block  sparse  vector  with  block  length  N  =  P/L 
and  w  =  Az  for  A  €  cMxP  with  columns  of  unit  2-norm.  For  BOMP  to  recover  z,  it 
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suffices  that 


KN  < 


1 

2 


+  N-(N-1) 


HA)  \  ' 

Tb(A) ) 


Furthermore,  BOMP  converges  to  z  in  at  most  K  steps,  a  vast  improvement  on 
the  KN  steps  that  would  be  required  with  OMP. 


4.4  Performance  guarantee  for  fast  hyperspectral  imaging 


In  this  section,  we  will  show  that  the  sensor  model  in  Section  4.1  satishes  the  con¬ 
ditions  in  Theorem  12  for  recovery  using  block  orthogonal  matching  pursuit  (BOMP). 
We  begin  with  a  definition: 

Definition  13.  A  matrix  V  G  {0,  l}^xi  is  said  to  be  an  (f?,  7) -incoherent  design  if 

(i)  every  column  of  V  has  exactly  R  ones. 

(ii)  every  pair  of  columns  has  at  most  7  ones  in  common. 

The  following  is  the  main  result  of  this  section: 


Theorem  13.  Take  A  as  defined  in  (28),  where  each  vq  is  given  by  a  different  row  of 
an  (1?,  7)  -incoherent  design.  The  BOMP  algorithm  will  recover  any  K -block  sparse  z 
from  w  =  Az  provided 


K  < 


1 

2 


The  proof  of  Theorem  13  relies  on  the  following  lemma: 


Lemma  14.  Take  A  as  defined  in  (28),  where  each  vq  is  given  by  a  different  row  of 
an  (1?,  7)  -incoherent  design.  Then 


Tb(A)  < 
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Proof.  Note  that  for  i  ^  j,  we  have  from  (27)  that 


AT  =  E(A,)f(AJ  j 

fc=l 

Q 

=  Y^(T-1BvkmT(^-1Bvk(j)I) 

k=  1 

=  BTT-{i~1)Tj-1B  =  {Vi,Vj)BTTi-iBr 

where  V)  denotes  the  ith  column  of  the  (. R ,  7)-incoherent  design  matrix  V.  Then 

llATlh^  =  I|(uu>bt^-b||2^2 

<  l{K,r,)ll|BT||2^||T-J||^2||B||2^  =  |(UU)I  <  7, 

where  the  last  inequality  uses  the  fact  that  any  pair  of  columns  of  V  have  at  most  7 
ones  in  common.  Definition  11  then  gives  the  result: 

pB(A)  =  -^max\\AjAj\\2  <  □ 

We  can  now  prove  the  main  result: 

Proof  of  Theorem  13.  Note  that  BOMP  recovers  z  from  w  =  Az  if  and  only  if  it 
recovers  z  from  w  =  {l/y/R)Az.  As  such,  it  suffices  to  check  that  A  :=  (1  /\/rR)A 
satishes  the  conditions  of  Theorem  12.  First,  A  has  columns  of  unit  2-norm  due  to 
the  scaling.  Since  A  is  composed  of  orthogonal  columns,  we  have  u(A)  =  0.  As  such, 
it  suffices  to  have 


NK  < 


1 

2 


1  /  R 

2  Ub(A) 


Lemma  14  then  gives  the  result. 


□ 
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At  this  point,  we  note  that  Theorem  3  in  [34]  gives  a  construction  of  a  Q  x  L 
(. R ,  l)-incoherent  design,  where  R  is  the  smallest  integer  satisfying  L  <  4 R2  log2  (2 R) 
and  Q  :=  \AR2  log(4i?)~| .  As  such,  the  number  of  exposures  is  Q  =  0(L/ log  R)  = 
0(L/  log  L),  and  we  can  have  K  =  0(R )  =  0(\/L/  log  L).  We  note  that  this  number 
of  exposures  is  a  vanishing  fraction  of  L,  i.e.,  the  number  of  exposures  required  by 
spatial  scanning.  As  such,  this  imaging  system  is  legitimately  compressive.  Also,  the 
number  of  active  blocks  we  allow  scales  according  to  a  square-root  bottleneck,  which 
we  should  expect  from  a  coherence-based  guarantee. 

4.5  Simulations 

In  this  section,  we  demonstrate  the  utility  of  BOMP  when  applied  to  spectroscopy. 
We  can  break  down  our  simulations  into  three  steps:  (1)  simuating  of  the  data  cube, 
(2)  simulating  of  the  sensor,  and  (3)  recovering  the  data. 

First,  we  produced  simulated  data  according  to  the  description  in  Section  4.1  (see 
Appendix  B  for  MATLAB  code).  In  particular,  we  created  a  1  x  100  x  100  data  cube 
X  of  stars  in  space.  For  our  simulation,  spectral  bands  range  from  400  nm  to  895  nm 
at  increments  of  5  nm.  The  temperature  T  of  a  star  is  chosen  uniformly  at  random 
from  a  range  of  4000K  to  10000K.  Each  star  is  simulated  with  ten  absorption  lines, 
each  at  random  wavelengths  { Ai, . . . ,  Aio}  drawn  uniformly  from  the  range  of  spectral 
bands.  These  absorption  lines  are  simulated  with  negative  Lorentzian  distributions 
as  in  (29),  where  a*,  and  [3k  are  height  and  width  parameters  respectively  for  the  kth 
absorption  line.  Specifically,  ag.  is  chosen  uniformly  from  a  range  of  0  to  2  nm  and  [3k 
is  chosen  uniformly  from  a  range  of  0  to  L(Xk,T)/2,  where  L(Xk,T)  is  the  spectral 
radiance  at  wavelength  Xk  from  Definition  9.  Figure  8  is  an  example  of  the  star  data 
simulated  in  MATLAB.  Throughout  the  simulations,  we  take  the  block  sparsity  level 
to  be  K  <3,  which  is  reasonable  considering  there  should  not  be  many  stars  in  any 
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Figure  9.  The  proportion  of  150  randomly  generated  1  x  100  x  100  data  cubes  X  which 
are  recovered  using  BOMP  (see  Appendix  B  for  MATLAB  code).  The  solid  line  shows 
the  proportion  of  data  cubes  with  block  sparsity  level  K  =  1  that  were  recovered,  the 
dotted  line  shows  the  proportion  for  K  =  2,  and  the  dash-dotted  line  illustrates  the 
K  =  3  case. 


given  1  x  100  x  100  data  cube  slice.  Once  X  is  generated,  we  create  a  block  sparse 
vector  z  from  X  by  reshaping. 

Next,  we  simulate  taking  Q  exposures  with  the  sensing  platform  using  the  mea¬ 
surement  matrix  A  described  in  Section  4.1.  We  let  the  mirrors  in  the  MMA  have 
random  orientation,  meaning  the  entries  of  V  are  iid  Bernoulli  with  some  probability 
parameter  p.  As  such,  the  columns  of  V  will  have  different  numbers  of  ones,  and  so 
V  will  not  be  an  incoherent  design  as  defined  in  Section  4.4.  Still,  we  take  inspiration 
from  Theorem  13  by  taking  p  to  be  somewhat  small.  Indeed,  each  column  will  tend 
to  have  about  Qp  ones,  and  pairs  of  columns  will  tend  to  have  about  Qp 2  ones  in 
common.  Thus,  the  natural  proxy  for  R/y  is  (Qp) / (Qp2)  =  1/p,  and  so  Theorem  13 
suggests  that  we  should  take  p  small  in  order  to  sense  signals  with  large  block  sparsity 
K .  For  our  simulations,  we  take  p  —  1/4. 

Finally,  after  generating  w  =  Az,  we  use  BOMP  as  defined  in  Section  4.3  to  get 
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a  solution  z.  In  these  simulations,  A  is  Q(N  +  L  —  1)  x  LN  =  199 Q  x  10000,  and 
we  let  Q  range  up  to  30.  As  such,  A  is  too  large  to  implement  naively  on  a  standard 
desktop.  We  therefore  hard-coded  how  each  of  the  blocks  of  A  (and  their  transposes) 
act  on  a  given  vector  (see  MATLAB  code  in  Appendix  B).  This  led  to  significant 
speednps  in  calculating  w  =  Az  and  in  running  BOMP. 

Number  of  exposures  for  recovery. 

In  the  first  simulation,  we  generated  data  cubes  with  varying  sparsity  levels  K  and 
observed  the  performance  of  BOMP  using  Q  e  {1, . . . ,  30}  exposures.  For  each  data 
cube,  BOMP  ran  for  K  iterations.  For  each  value  of  0.  we  performed  BOMP  on  150 
random  data  cubes  to  produce  the  estimate  z,  and  we  declared  successful  recovery  if 
|| z  —  z\\2  <  10”14.  Figure  9  illustrates  the  proportion  of  trials  that  were  successful  for 
block  sparsity  level  K  =  1  (solid  line),  K  =  2  (dotted  line),  and  K  =  3  (dash-dotted 
line).  As  we  can  see,  it  suffices  to  have  Q  =  13  exposures  for  successful  recovery  when 
K  =  1.  Also,  90  percent  of  the  data  cubes  with  block  sparsity  K  =  2  and  K  =  3  are 
successfully  recovered  from  22  and  30  exposures,  respectively. 

The  effect  of  noisy  measurements. 

In  the  second  simulation,  we  add  Gaussian  random  noise  to  our  measurements. 
Specifically,  we  measure  a  data  cube  with  block  sparsity  level  K  =  3  and  Q  =  30 
exposures,  resulting  in  measurements  of  the  form  w  =  Az  +  e,  where  e  is  a  vector 
composed  of  iid  Gaussian  random  variables  with  mean  zero  and  standard  deviation 
0.008.  Figure  10(a)  shows  the  true  spectral  radiance  of  one  of  the  stars  in  z,  and 
Figure  10(b)  shows  the  spectral  radiance  of  the  reconstruction  using  K  iterations 
of  BOMP.  As  we  can  see,  shallow  absorption  lines  (located  at  wavelengths  560  nm, 
815  nm,  and  825  nm  in  our  example)  are  drowned  out  by  the  noise.  In  practice,  one 
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Figure  10.  (a)  The  spectrum  of  one  of  the  three  simulated  stars,  all  of  which  were 

measured  in  the  presence  of  Gaussian  noise,  (b)  Reconstruction  of  the  same  star’s 
spectrum  using  BOMP.  Notice  that  the  three  most  shallow  absorption  lines  (at  560, 
815,  and  825  nm)  were  lost  in  the  noise. 


might  overcome  this  apparent  loss  of  information  by  fitting  spectra  of  chemicals  to 
the  absorption  line  locations  and  depths.  Indeed,  the  star’s  chemical  composition  is 
precisely  the  objective  of  spectroscopy. 
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V.  Conclusions  and  future  work 


In  this  chapter,  we  summarize  our  results  and  discuss  future  research. 

5.1  Discrete  uncertainty  principles  and  applications 

In  Chapter  II,  we  discussed  the  0-norm  uncertainty  principle  and  gave  a  concise 
proof  of  the  Donoho-Stark  uncertainty  principle  developed  in  [8].  Additionally,  we 
introduced  and  characterized  equality  in  a  new  mixed-norm  uncertainty  principle. 
We  also  showed  that  a  discretized  and  periodized  Gaussian  achieves  near  equality  in 
both  the  Donoho-Stark  and  mixed-norm  uncertainty  principle.  In  Chapter  III,  we 
applied  these  results  to  demonstrate  a  fundamental  limitation  in  signal  demixing  and 
to  detect  K -sparse  signals  from  only  O(K)  measurements. 

For  future  work,  it  would  be  interesting  to  apply  this  new  uncertainty  principle  to 
the  fundamental  problem  of  compressed  sensing  with  a  partial  Fourier  matrix  [35,  36]. 
Specifically,  how  many  random  rows  M  of  the  N  x  N  discrete  Fourier  transform 
matrix  are  necessary  to  satisfy  the  (K,  5) -restricted  isometry  property?  Recently, 
Bourgain  [36]  demonstrated  that  M  =  0§(K  log3  N)  rows  suffice,  whereas  Gaussian 
matrices  only  need  M  =  Os(K  \og(N/K ))  [6].  Recall  that  in  Section  3.2,  we  used 
our  uncertainty  principle  to  show  that  if  ||a;||o  <  K  and  ||o:|||  =  N/K ,  then  a  random 
partial  Fourier  matrix  A  satisfies  ||7hr||i  3>  0  with  high  probability  provided  M  = 
O(K).  Since  the  RIP  problem  is  so  similar,  we  suspect  that  similar  ideas  might  apply. 

5.2  Fast  hyperspectral  imaging 

In  Chapter  IV,  we  discussed  the  hyperspectral  imaging  problem  in  the  context  of 
spectroscopy  of  stars.  We  briefly  described  conventional  hyperspectral  platforms  and 
their  limitations  before  proposing  a  compressed  sensing  platform  that  can  quickly 
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sample  hyperspectral  data.  Using  combinatorial  designs,  we  built  coded  apertures 
that  can  be  used  to  quickly  collect  hyperspectral  data,  and  from  which  we  can  quickly 
recover  the  desired  imagery  using  block  orthogonal  matching  pursuit. 

For  future  work  in  this  direction,  we  are  particularly  excited  about  real-world 
implementation.  This  will  bring  its  own  challenges,  such  as  calibration  and  non- 
Gaussian  noise,  but  we  suspect  the  theory  is  robust  to  such  deviations,  and  it  would 
be  interesting  to  explicitly  include  them  in  the  theory. 

Also,  we  currently  use  random  coded  apertures,  but  with  each  exposure,  the  micro¬ 
mirrors  need  to  mechanically  change  their  orientation.  In  the  real  world,  these  changes 
in  orientation  cost  energy,  which  is  a  limited  resource  in  remote  sensing  platforms. 
As  such,  one  possible  area  for  future  research  would  be  to  develop  incoherent  designs 
for  which  changes  in  orientation  are  minimized. 
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Appendix  A.  Supporting  material 

1.1  Harmonic  analysis  background 

For  this  section  define  x  as  in  Section  2.1.  The  following  claim  helps  establish  the 
assumptions  needed  for  Definition  1. 

Claim  1.  The  dual  of  G,  G  :=  {x}  is  an  abelian  group. 

Proof.  Let  gx,g2  EG  and  Xi,  X/2  G  G, 

xiX2[gi  +  g-i\  =  xi[gi  +  +  g-i\ 

=  Xi[gi\xi[g2]x2[gi\x2[g2\  =  xi[gi]x2[gi\xi[g2\x2[g2]  =  xiX2[gi]xiX2[g2}- 
Therefore,  X1X2  G  G  so  G  is  closed.  Define  e{g )  :=  1  \/g  E  G.  Then 

e[gi  +  ^2]  =  1  =  1  •  1  =  e\gi]e\g2], 

meaning  e  E  G,  and  so  G  has  an  identity  element.  Define  <p[g\  :=  1  /x[g\  Vg  £  G 

<^[91  +  92]  x^1  +  g^  x[gi]x[g2\  (x[si])  (xfel) 

Therefore,  <p  E  G.  Let  g  be  an  arbitrary  element  of  G.  Then 

(</>x)\g]  =  4>[g]x[g]  =  -rrxfc]  =  i  =  e\g], 
x[g] 

and  so  <f>  =  x_1-  Thus,  G  is  a  group.  Lastly, 

{xiX2)[g\  =  xi[g]x2[g\  =  X2[g\xi[g\  =  (X2X1M, 
and  so  G  is  an  abelian  group.  □ 
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The  following  Claims  help  establish  that  Gk  =  Gk. 

Claim  2.  For  all  y  e  G,  22geGx[g\  =  0. 

Proof.  Let  y  G  G  and  fix  g2  G  G  such  that  x[92\  7^  L 

x[9i]x[92]  =  x^91  +  92]  =  X[9i}- 

gi£G  9iS  G  91  £G 

where  the  last  equality  is  true  by  the  closure  of  G.  Rearranging, 

x[gi\  (xb]  -  1)  =  0. 

91 SG 

We  know  yb]  —  1  7^  0  because  (72  is  arbitrary  so  YlgeG  xb  =  0.  □ 

Because  22  x[g]  =  0  we  can  show  that  G  consists  of  orthogonal  elements  of  £(G,  C). 
In  fact,  G  is  an  orthonormal  basis  for  £(G,  C). 

Claim  3.  {x}xeg  is  an  orthonormal  basis  for  £(G,C). 

Proof  Let  y  G  G.  First,  we  will  show  that  ||x||2  =  1  Vy  G  G. 

\7PIX’  7WX)  =  iF§x[9lx[91  =  W\  § 1x191,2  =  =  L 

Therefore,  the  statement  is  true  for  an  arbitrary  y  so  it  is  true  for  all  y  G  G.  Addi¬ 
tionally,  the  above  shows  that  y [g]  =  y_1[<?].  Second,  we  will  show  orthogonality.  Let 
X,ip  be  arbitrary  elements  of  G,  such  that  y  7^  if.  Then, 

{vWf'  TWf)  =  IF  lb91*1  =  W\  = 

The  last  inequality  comes  from  the  fact  that  y^-1  G  G  and  Claim  2.  Therefore 
{x}g&G  orthonormal.  We  know  that  £(G,C)  is  a  \G (-dimensional  vector  space 
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and  that  |G|  =  |G|.  Therefore,  {x}xeQ  sPans  I(G,C).  Thus,  {x}xeg  is  a  basis  for 
i{G,  C).  □ 

Claim  4.  If  x  1,  •  •  ■  ,  Xk  e  G  ihen  Xi  ■  ■  ■  0  Xfc  e  Gk. 

Proof.  By  definition, 

(Xi®  •••®Xfc)[(0i,02,--  -  +  >0fc)]  =  (Xi  ®  —  <8)  Xfc)  |>i  +g'v-  ,9k  +  9k ] 

=  Xi[5i  +  ^i]"-Xfc[5fc  +  ^fc] 

=  xiklxibi]  ■  •  •  Xfc[^]xfebfc] 

=  (xik]  •••Xfcbfc])(xibi]  •••XfcbfeD- 

The  last  equality  is  the  definition  of  (xi  ®  •  •  •  0  Xfe)  [9i ,  •  •  •  ,  9k\  (xi  ®  •  •  •  ®  Xk)  [g[  •  •  •  <?£] . 
Therefore,  Xi  ®  •  •  •  ®  Xfc  G  Gfc.  □ 

We  have  thus  proven  that  {xi  ®  X2  ®  •  •  •  ®  Xk}  C  but  we  can  actually  prove 
set  equality. 

Claim  5.  For  xi,  X2,  •  •  •  ,  Xk  e  G,  {xi  ®  X2  ®  ■  ■  ■  ®  Xk }  = 

Proof.  We  need  to  show  that  #{xi  ®  X2  ®  •  •  •  ®  Xk}  =  \ Gk \ .  Hence,  it  suffices  to  show 
that  each  Xi  ®  X2  ®  •  •  •  ®  Xk  is  unique.  Assume  xi  ®  X2  ®  •  •  •  ®  Xk[hi,  h2,  •  •  •  hfe]  = 
Vh®^®- ,hk\  for  all  [/ii,  /i2,  —  ,  hfc]  e  Gfc.  Let  \g1,g2,---  ,9k]  e  Gfc 
such  that  gm  =  g  E  G  and  gn  =  0  for  all  n  7^  m.  Therefore,  by  definition 

(xi®X2®  ■  ■  ■  ®  Xk)\gi, 92,  ■  ■  ■  ,9k]  =  (ipi  ®  fa  ®  ®  4>k)[gi, 92,  ■  ■  ■  ,9k] 

Xi[gi}x2[g2\  ■  ■■Xk[gk\  =  M9i]M92]  ■  ■  ■tfklgk] 
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Plugging  in  our  definition  of  (gi,  g2,  ■  ■  ■  , gk)  we  get 

xi  [0]  X2  [o]  •  •  •  Xm  [g]---  Xk  [o]  =  [o]^2  [o]  •  •  •  ^ ^  [o] 

Xm[g\  =  m[g\ ■ 

Since  g  and  m  are  arbitrary,  this  statement  is  true  for  all  g  €  G  and  m  <  k.  Thus, 

Xi  ®  X2  ®  ®  Xfc  =  Vh  ®  V>2  <8>  *  •  •  ®  So  each  xi®,  X2  ®  •  •  •  <£>  Xk  is  unique.  □ 

By  Claims  3  and  5  we  see  that  Gk  =  Gk. 


1.2  Poisson  summation  formula 


The  use  of  the  Poisson  Summation  Formula  relies  on  specific  assumptions  regard¬ 
ing  /.  It  suffices  that  the  function  /  is  in  Schwarz  Space.  A  consiquence  of  /  e  S  is 
that  f(x)  <  C/x2  for  some  C  >  0.  A  quick  review  of  the  bounded  convergence  theo¬ 
rem  will  be  useful  as  we  build  the  function  that  achieves  near  equality  in  Theorems  2 
and  6.  The  theorem  below  is  a  corollary  to  the  bounded  convergence  theorem. 

Lemma  15.  Assume, 

OO  N 

f(x)  =  hM,  fN  (x)  :=  h(x  +  n ) 

n=— 00  n=—N 

and,  converges  pointwise  to  f .  Then 

lim  [ 

N^-oo  J q 


fN{x)dx  =  /  f(x)dx. 


Proof.  By  the  triangle  inequality  and  Definition  2,  we  know  that 


I.My)l=  Hx  +  n )  <  \h(x+)\< 


(x  +  n)2  +  1 


(x  +  n )2  +  1 
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We  can  break  the  sum  up  into  negative  and  nonnegative  indices.  Hence 


n  i  r  -1  i  N  ^ 

(j  v _ _  (j  w _ L  _ 

“  r  (x  +  n)2  +  1  “  (x  +  n)2  +  1  ^  (x  +  n)2  +  1 


Since  x  G  [0, 1], 


r  -1  x  N  i  i  r  -1  x  N  i 

^  ^  (x  +  n )2  +  1  ^  (x  +  n)2  +  1  ~  C  ^  (n  +  l)2  +  1  ^  n2  +  1 

_n=—N  v  ’  n=0  v  ’  J  _n=—N  v  ’  n=0 


Let  n  :=  —  n.  Then 


"  — 1  AT  1  r  AT  N 

C  V  1  i  V  1  =c  V  1  i  V  1 

^  (n  +  l)2  +  1  ^  n2  +  1  ^  (n  —  l)2  +  1  z--'  n2  +  1 

_n=—N  v  7  n=0  J  n=l  v  7  n= 0 


Combining  terms  gives  us 


r  n  n  nr  n  , 

r  X'  1  'X  1  -  r  wf  1  1 

6  ^  (n -  i)2  + 1  +  ^  XXI  _C  1  +  ^  V(n-i)2  +  i  +  ^TT 

_n=l  v  n=0  J  L  n=l  x  v  7 


Since  n  >  0, 


y  /  i  ^  \  ^  ^  ^ 

C  1  +  ^  V(n-1)2  +  1  +  ^2Tl)  -C  1  +  2  ^  (n  -  l)2  +  1 

n=l  N  x  7  7  n=  1  v  7 


Let  n'  —  n  —  1 .  Then  by  change  of  variables, 


1  "  pN—1  ^  "  roo  -y 

C  1  +  2  ^  t  ,\9  ,  i  — ^  1  +  2  +  2/  dy  <  C  3  +  2  /  dy 

w)  +i  L  ./o  y2  + 1  J  L  Jo  y2  +  l 


Evaluating  the  integral  then  gives 


1*00  1  "  OO- 

C  3  +  2  /  — - dy  =  C  3  +  2  arctan  (y)  =  C  [3  +  7r]  =:  M. 

Jo  r  +  i  .  .  n  . 


63 


Therefore,  3 M  G  M  such  that  |/jv(y)|  <  M.  Thus,  by  the  bounded  convergence 
theorem, 


fn(x)dx  =  f  f(x)dx  □ 

Jo 

Another  tool  necessary  to  prove  the  Poisson  summation  formula  is  Lemma  16 
below. 


lim 


■N— >oo 


Lemma  16.  If  f  E  S,  then 


y. 

nEZ 


f  (x  +  n)  e 


—27 rikx 


dx  = 


z _ / 

nEZ  ' 


f  (x  +  n )  e 


—2ivikx 


dx. 


Proof.  Define  Hn{x)  :=  fpf(x)e  2mkx  where  $n{x)  is  defined  as  in  Lemma  15.  We 
know  that  hjq{x)  converges  pointwise  to  f(x)e~2mkx.  Thus 


\hN(x)\  =  \fN{x)e~2nikx\  =  \fN(x)\\e-^ikx\  <  M. 


Therefore,  by  Lemma  15 


lim 


•TV— >-00 


p 1  N  p\ 

/  ^2  9  (x  +  n)  e~2mkxdx  =  22g(x  + 

J°  n=—N  J°  neZ 


n)  e2mkxdx.  (33) 


Additionally,  we  know 


pi  p  1  N  N  p  i 

/  hN(x)dx  =  /  22  9  (x  +  n)  e~2mkxdx  =  22  / 

^0  ^0 7V  T  _ AT  J  0 


g  (x  +  n)  e~2nikxdx.  (34) 


n=—N 


n=—N ' 


Taking  the  limit  of  both  sides  of  (34)  we  get 


pi  N  pi 

limAr^oo  /  ^2  g(x  +  n)  e~2mkxdx  =  22  I 

J°  n=—N  nGZ 


g  (x  +  n )  e  2mkxdx  (35) 
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Therefore,  by  (33)  and  (35)  yields, 


=  X  /  9(x  +  n)e-2*ikxdx=  f  ^  g  (x  +  n)  e2nikx dx .  □ 

n£Z  n£Z 

With  Lemma  16  we  have  a  sufficient  foundation  to  prove  the  Poisson  summation 
formula  [37]. 

Theorem  14  (Poisson  Summation  Formula).  For  some  g  E  S, 


X 9[n \  =  X  ( Fr 9)  ik] 

nGZ  /cEZ 

Proof.  Define  f(x)  :=  ^2nez9(x  +  n)  and  as  the  Fourier  transform  from  M  to  Z. 
Then, 


1  f27T 

X  g(x+n)  =  (Fj1  Fjf)(x)  =  V  (FTf)  [k]e2*ikx  =  V  —=  f(x)e~2mkxdxe2”ikx 

nez  fcez  fcez  ^  2?r 


Substituting  in  our  definition  of  f(x)  yields 


1  /*27t  _  1  p2tv 

V  -=  /  f(x)e-2’ik‘dxe2’ik*  =  J2  -7x=  / 

fcez  v  27T  Jo  fcgZ  V  27T  Jo 


X^  + 


n 


.nEZ 


EiEf 

i.c’y  v2tt  ™  Jo 


g— 2nikx  ^2nikx 

(36) 


n)e~2lzikxdxe2lzikx 


/cEZ  v  nEZ 


Where  the  last  statement  is  true  by  Lemma  16.  Substituting  u  —  n  +  x  into  (36) 
gives  us 

^ ^  ^  p27r-\-n 

(37) 


-1  _  p2TT-\-n 

X^X  /  g(u)e-2nik{u-n)due2nikx. 


k&Z  nez  Jn 
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We  know  that 


r»27 r+n  _  p27r-\-n  /»oo 

0—2,Kik{u—n)rj  _  \  A  /  „(n.\ „—2nik „27riknjn,  _  /  „(n.\ 2'niku  r 


rZn-\-n  _ pZn-\-n 

Y  /  g(u)e-2mk^du  =  Y  9i 

nGZ  ^ n  n&L  ^ n 


u)e~2mke2mkndu  =  /  g(u)e~2mkudu. 


because  e2mn  =  0.  Thus,  (37)  is  equivalent  to 


E 

fcez 


g(u)e-2”ikudu 


[V^J-c 


=  V(FR«?)[fc]e2 


fcez 


where  the  last  equality  is  by  the  definition  of  F^g  .  Therefore, 


Ya(x  +  n)  =  J^(.Fr0)[A;]« 


2i rikx 


ke  z 


Evaluating  both  sides  at  x  =  0,  we  get 


Y9^  =  ^2^3)  [k]- 

n€Z  k£Z 


□ 


Theorem  14  will  be  useful  for  finding  the  Fourier  transform  pair  needed  for  near 
equality. 

Corollary  1.  For  all  f  G  S, 


Yf(p~s)  = 


—2iriqs 


p£Z 


q£Z 


Proof.  Let,  h(x)  :=  f(x  —  s ).  By  definition, 


(FRh)  (x)  :  =  /  h(t)e~2mxtdt. 


(38) 
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Substituting  if  —  t  —  s  into  (38)  yields 


r*00  pOO 

—  27TlXt 


h(t)e~2nixtdt  —  /  f(t- s)e~2nixtdi  =  /  f(t')e-2^xit'+s)dt'. 


We  know  that  ya+b  =  yayb.  Therefore 


f(t')e-2nix^+sUt'  =  e~27rixs  /  f(t')e-2nixt'dt'  =  e~2™s  (FRf)  (x) .  (39) 


By  Theorem  14, 


=  ^2(FRh)  (, q ). 

pEZ  O'EZ 


Substituting  in  our  definition  of  h  and  the  result  of  (39), 


=  Y:  W)  (q)( 


,—2  nqs 


p£Z 


q&j 


□ 


1.3  Properties  of  the  Gaussian 


Claim  6.  for  all  y  G  C 

/OO 

e~y2  dy  =  yfr 

-OO 

Proof.  Because  a:  =  Vx2,  we  can  rewrite  (40)  as 


(40) 


^oo  poo  \  1/2 

e~y2 d.y  =  (  /  e~y2 dy  /  e~x2 dx  j  = 


r»oo  /»oo  \  1/2 


'  — OO  */  — OO 


e-(y2+x2)dydxj  . 


Converting  to  polar  coordinates  yields 


'•oo  poo  \  1/2 

..2_l«2'  ' 


r  — oo  — OO 


e-(y  +* 

/  WO  ^0 


27t  /»oo  \  1/2 

/  e~r2rdrd$\ 
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Let  u  :=  r2.  Then  by  change  of  variables 


r»27r  poo 


1 0  Jo 


)l/2  r2n  POO  \ 1/2  /  /•oo  \  1/2 

=  (  -  /  /  e“ttdud0  )  =  (  7T  /  e"udu  )  =  V7T 


'0  ./  0 


y 


'0 


y 


which  implies  the  result. 


□ 


Given  the  integral  of  e  v~  we  can  easily  compute  F(e  y~). 


Claim  7.  The  Fourier  transform  of  f{x)  :=  e  x2  is 


(Fm)=e-’  Z  sfii 


Proof.  By  definition  of  Fourier  transform, 


(Ffm  = 


~2^xdx  = 


~(x+2iri£x)  _ 


D—  (x+nit;)2 —tt2  T* 


dx 


where  the  last  equality  is  from  completing  the  square.  Let  u  :=  —xnif. 
Since  /  is  analytic, 


e-(x+TTi£)2—K2t2  dx  =  e-n2e  /  e-n2du  = 


where  the  last  inequality  is  clue  to  Claim  6.  □ 

One  specific  property  of  the  Fourier  transform  pair  defined  in  Claim  7  is  the 
property  of  dilation. 

Claim  8.  If  h(x)  :=  f(mx)  for  some  function  f ,  then 

( Ff ) 

(Fh)(g)  = 

m 
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Proof.  By  Definition  of  F  :  C  — >  C  and  h, 


(Fh)(0 


h{x)e~2nixxdx 


f(mx)e~27Tixxdx  = 


i  r°° 

—  /  f(u)e~2m^™du 

m  J- oo 


where  the  last  equality  comes  from  change  of  variables  where  u  :=  x/m.  Rearranging 
gives  us  the  result: 


1  /*°° 

-  /  f(u)e~2niu™du 

m  J- oo 


□ 


1.4  Supporting  claim  for  Section  2.4 


Claim  9  (Geometric  Sum  Formula).  Suppose  q,k  €  Z.  Then 


n—  1 


[g27ri(g-fc)/ra' •?  _ 


n 


'  n  ‘ 


j=o 


0,  otherwise . 


Proof.  Without  loss  of  generality,  assume  that  e27™*-9  7^  1.  Then 


n—  1 

|^e27rj(g-fc)/nj  J  _ 

j=0 


j2ni(q—k)/n 


'  n  —  1  g2 ni(q-k)  _  ^ 


1-1 


27T  i(q  —  k)  27T  i(q  —  k)  2Tri(q—k ) 

e  n  —  1  e  n  —  1  e  n  —  1 


=  0. 


If  e2m(<?  fc)/n  _  ^  then 


n— 1  n— 1 

^  j-e2,rt(g-fc)/njJ  =  ]_  =  n 

j=0  7=0 


We  know  that  e27r^9  fc^n  =  1  if  and  only  if  TA  g  Therefore, 


n— 1 


j"e27ri(q-fc)/n' J  _ 


n 


,  sdez, 


7=0 


0,  otherwise , 


completing  the  proof. 


□ 
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Appendix  B.  MATLAB  code  for  Chapter  IV 


Data  simulation 

The  following  MATLAB  code  simulates  the  data  as  described  in  Section  4.5. 

function  x=stardata() 

%  author:  2d  Lt  Megan  Lewis 

1  description:  outputs  the  data  that  simulates  a  "slice"  of  the  sky 

%  with  K  stars 

M=100;  l  number  of  locations 

L=100;  l  number  of  wavelengths 

Fslice=zeros(M,L) ; 

permindices=randperm(M) ; 

K=2;  %  number  of  stars  (K-sparse) 
list=permindices (1 : K) ;  %  locations  of  stars 


%  first  we  build  a  slice  of  the  data  cube 
for  kk=l:K 

h=6 . 6260*10" (-34) ;  %  planck’ s  constant 

c=2 . 9979*10"8 ;  %  meters  per  second 

k=l . 3806*10" (-23) ;  %  boltzmann’s  constant 

T=rand*6000+4000 ;  %  Kelvin 

lambda=(400 : 5 : 895) *10" (-9) ;  %  meters 

stuf f =h*c . / (lambda*k*T) ; 

radiance=2*h*c"2 . / (lambda. "5.*(exp(stuff)-l)) ; 
radabs=radiance ; 
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lines=10;  °/0nuraber  of  absorbtion  lines 
for  ell=l: lines 

index=ceil (rand*100) ; 
lambdaO=lambda(index) ; 
width=rand*2*10~ (-9) ; 
height=rand*radiance (index) /2 ; 

radabs=radabs-height . / (l+(lambda-lambdaO) . ~2/width~2) ; 
end  °/0for  loop 

radabs=radabs/norm(radabs) ; 

Fslice (list (kk) , : )=radabs; 
end  %for  loop 


x=Fslice ’ ; 


end  /(function 

Runme  file 

The  following  MATLAB  code  was  used  in  order  to  perform  block  orthogonal 
matching  pursuit  (BOMP)  for  simulated  data  z  and  to  evaluate  if  BOMP  successfully 
reconstructed  z. 


1 - 

%  author:  2d  Lt  Megan  Lewis 

%  description:  The  purpose  of  this  .m  file  is  to  run  all  the  functions 
1  necessary  to  generate  test  data,  perform  BOMP  and  evaluate  the 
7„  performance  of  the  algorithm. 
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“/Initialization 
n=0 ; 

°/„loop  that  controls  the  number  of  masks  or  measurements, 
for  numasks=l : 30 

“/loop  that  controls  the  number  of  iterations  per  given  number  of  masks, 
for  runs=l:30 
n=n+l ; 

“/Simulates  the  data  that  BOMP  is  trying  to  recover 
z=stardata(3) ; 

[nz,  zs]=efficient_bomp(z,numasks,  3); 
stars3(n)=norm(zs-nz) ; 
end  “/run  count 
end  °/num  masks  loop 

BOMP 

The  following  MATLAB  code  runs  BOMP  by  hard-coding  the  model  matrix, 
function  [nz,  zs] =eff icient_bomp(z,  Q,  star) 

“Z - 

“/„  Author:  2d  Lt  Megan  Lewis 

“/„  Purpose:  To  do  Block  orthogonal  matching  pursuit  in  a  timely  manner. 

”Z - 

“/  INPUT 

“/„  z=vectorized  data  cube 
“/„  k=sparsity  level  of  z 
“/„  Q=number  of  exposures 
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1 - 

1  OUTPUT 

*/.  nz=recovered  z 

l - 

*/.  INITIALIZATION 
[L,  N] =size(z) ; 

zs=reshape(z, [numel(z) , 1] ) ;  %turns  data  cube  into  data  vector 
w=zeros (Q* (N+L-l) , 1) ;  ’/(initiates  w 

°/„The  incoherent  design  we  will  be  taking  our  mask  values  from: 
V=binornd(l,0.25,C],L)  ; 

l - 

1  Getting  w=Az 
for  q=l:Q 

for  1=1 :L 

w( (q-1) * (N+L-l)+l : q* (N+L-l) ) =w( (q-1) * (N+L-l) +1 : q* (N+L-l) ) . . . 

+V(q, 1) * [zeros (1-1 , 1) ;  zs( (1-1) *N+1 : 1*N) ;  zeros(L-l, 1)] ; 
end  %1  for  loop 
end  %q  for  loop 

% - 

%INITIALIZATION  OF  BOMP 
bestinds=  []  ; 

Id=eye(N) ; 
r=w; 

% - 

1  BOMP 

for  index=l:star 
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%First  find  argmax | | A_i~*r | | _2 
norms=zeros (L , 1) ; 
for  1=1 :L 
temp=0 ; 
for  q=l:Q 

temp=temp+V (q, 1) *r (1+(N+L-1) * (q— 1) : 1+N-1+(N+L-1) * (q— 1 ) )  ; 
end  %q  for  loop 
norms (l)=norm(temp) ; 


end  %1  for  loop 
[val  ind] =max (norms ) ; 
bestinds (index) =ind ; 

%Next  we  calculate  Aind 

Aind=zeros (Q* (N+L-l) , length(bestinds) *N) ; 
for  ii=l : length(bestinds) 
j j=bestinds (ii) ; 
for  n=l : N 

for  q=l:Q 

Aind( (q-1) * (N+L-l) +1 : q* (N+L-l) , (ii-1) *N+n)= . . . 

Aind( (q-1) * (N+L-l) +1 : q* (N+L-l) , (ii-l)*N+n)+V (q,j j )* • 
[zeros (jj— 1,1) ;  Id(:,n);  zeros(L-j j , 1)] ; 
end  %q  for  loop 
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end  %n  for  loop 
end  °/„ii  for  loop 
"/Calculating  ztilde 

ztilde=Aind\w;  "/least  squares  minimization 
7  Updating  the  residuals 
r=w-Aind*ztilde ; 
end  7  index  for  loop 

7 - 

7.  Calculating  nz 
nz=zeros (size (zs) ) ; 
for  ii=l : length(bestinds) 
kk=bestinds (ii) ; 

nz((kk-l)*N+l:kk*N)=ztilde((ii-l)*N+l:ii*N) ; 
end  "/for  loop 

end  %ef f icient_bomp 
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